Related papers: A Multi-Objective Framework for Optimizing GPU-Ena…

200 papers

The explosive growth of AI applications has created unprecedented demand for GPU resources. Cloud providers meet this demand through GPU-as-a-Service platforms that offer rentable GPU resources for running AI workloads. In this context, the…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-11-25 Marco Zambianco, Lorenzo Fasol, Roberto Doriguzzi-Corin

Modern GPU workloads increasingly demand efficient resource sharing, as many jobs do not require the full capacity of a GPU. Among sharing techniques, NVIDIA's Multi-Instance GPU (MIG) offers strong resource isolation by enabling…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-12-19 Hsu-Tzu Ting, Jerry Chou, Ming-Hung Chen, I-Hsin Chung

There is an urgent and pressing need to optimize usage of Graphical Processing Units (GPUs), which have arguably become one of the most expensive and sought after IT resources. To help with this goal, several of the current generation of…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-09-11 Bekir Turkkan, Pavankumar Murali, Pavithra Harsha, Rohan Arora, Gerard Vanloo, Chandra Narayanaswami

GPU clusters in multi-tenant settings often suffer from underutilization, making GPU-sharing technologies essential for efficient resource use. Among them, NVIDIA Multi-Instance GPU (MIG) has gained traction for providing hardware-level…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-11-14 Myeongsu Kim, Ikjun Yeom, Younghoon Kim

Efficient power management in cloud data centers is essential for reducing costs, enhancing performance, and minimizing environmental impact. GPUs, critical for tasks like machine learning (ML) and GenAI, are major contributors to power…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-05-15 Tirth Vamja, Kaustabha Ray, Felix George, UmaMaheswari C Devi

Deep learning training is an expensive process that extensively uses GPUs, but not all model training saturates modern powerful GPUs. Multi-Instance GPU (MIG) is a new technology introduced by NVIDIA that can partition a GPU to better-fit…

Machine Learning · Computer Science 2023-04-25 Ties Robroek, Ehsan Yousefzadeh-Asl-Miandoab, Pınar Tözün

Advances in GPU compute throughput and memory capacity bring significant opportunities to a wide range of workloads. However, efficiently utilizing these resources remains challenging, particularly because diverse application…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-04-10 Gabin Schieffer, Ruimin Shi, Jie Ren, Ivy Peng

GPU technology has been improving at an expedited pace in terms of size and performance, empowering HPC and AI/ML researchers to advance the scientific discovery process. However, this also leads to inefficient resource usage, as most GPU…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-10-10 Baolin Li, Tirthak Patel, Siddarth Samsi, Vijay Gadepally, Devesh Tiwari

NVIDIA's Multi-Instance GPU (MIG) technology enables partitioning GPU computing power and memory into separate hardware instances, providing complete isolation including compute resources, caches, and memory. However, prior work identifies…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-04-30 Bingyao Li, Yueqi Wang, Tianyu Wang, Lieven Eeckhout, Jun Yang, Aamer Jaleel, Xulong Tang

NVIDIA MIG (Multi-Instance GPU) allows partitioning a physical GPU into multiple logical instances with fully-isolated resources, which can be dynamically reconfigured. This work highlights the untapped potential of MIG through moldable…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-07-21 Jorge Villarrubia, Luis Costero, Francisco D. Igual, Katzalin Olcoz

Multi-Instance GPU (MIG) is a new feature introduced by NVIDIA A100 GPUs that partitions one physical GPU into multiple GPU instances. With MIG, A100 can be the most cost-efficient GPU ever for serving Deep Neural Networks (DNNs). However,…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-09-24 Cheng Tan, Zhichao Li, Jian Zhang, Yu Cao, Sikai Qi, Zherui Liu, Yibo Zhu, Chuanxiong Guo

The proliferation of GPU-accelerated workloads, particularly in artificial intelligence and large language model (LLM) inference, has created unprecedented demand for efficient GPU resource sharing in cloud and container environments. While…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-12-30 Jithin VG, Ditto PS

Continuous learning (CL) has emerged as one of the most popular deep learning paradigms deployed in modern cloud GPUs. Specifically, CL has the capability to continuously update the model parameters (through model retraining) and use the…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-07-19 Tianyu Wang, Sheng Li, Bingyao Li, Yue Dai, Ao Li, Geng Yuan, Yufei Ding, Youtao Zhang, Xulong Tang

In cloud machine learning (ML) inference systems, providing low latency to end-users is of utmost importance. However, maximizing server utilization and system throughput is also crucial for ML service providers as it helps lower the…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-03-01 Yunseong Kim, Yujeong Choi, Minsoo Rhu

The High Performance Computing (HPC) field is witnessing a widespread adoption of Graphics Processing Units (GPUs) as co-processors for conventional homogeneous clusters. The adoption of prevalent Single-Program Multiple-Data (SPMD)…

Distributed, Parallel, and Cluster Computing · Computer Science 2015-11-25 Teng Li, Vikram K. Narayana, Tarek El-Ghazawi

CPU-GPU heterogeneous systems are now commonly used in HPC (High-Performance Computing). However, improving the utilization and energy-efficiency of such systems is still one of the most critical issues. As one single program typically…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-05-08 Eishi Arima, Minjoon Kang, Issa Saba, Josef Weidendorfer, Carsten Trinitis, Martin Schulz

In cloud environments, GPU-based deep neural network (DNN) inference servers are required to meet the Service Level Objective (SLO) latency for each workload under a specified request rate, while also minimizing GPU resource consumption.…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-09-24 Munkyu Lee, Sihoon Seong, Minki Kang, Jihyuk Lee, Gap-Joo Na, In-Geol Chun, Dimitrios Nikolopoulos, Cheol-Ho Hong

GPUs are vastly underutilized, even when running resource-intensive AI applications, as GPU kernels within each job have diverse resource profiles that may saturate some parts of a device while often leaving other parts idle. Colocating…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-02-17 Paul Elvinger, Foteini Strati, Natalie Enright Jerger, Ana Klimovic

To facilitate cost-effective and elastic computing benefits to the cloud users, the energy-efficient and secure allocation of virtual machines (VMs) plays a significant role at the data centre. The inefficient VM Placement (VMP) and sharing…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-07-29 Deepika Saxena, Ishu Gupta, Jitendra Kumar, Ashutosh Kumar Singh, Xiaoqing Wen

Cloud computing provides a computing platform for the users to meet their demands in an efficient, cost-effective way. Virtualization technologies are used in the clouds to aid the efficient usage of hardware. Virtual machines (VMs) are…

Distributed, Parallel, and Cluster Computing · Computer Science 2010-11-24 Umesh Bellur, Chetan S Rao, Madhu Kumar SD