Related papers: Leveraging Multi-Instance GPUs through moldable ta…

200 papers

Multi-Instance GPU (MIG) is a new feature introduced by NVIDIA A100 GPUs that partitions one physical GPU into multiple GPU instances. With MIG, A100 can be the most cost-efficient GPU ever for serving Deep Neural Networks (DNNs). However,…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-09-24 Cheng Tan, Zhichao Li, Jian Zhang, Yu Cao, Sikai Qi, Zherui Liu, Yibo Zhu, Chuanxiong Guo

Modern GPU workloads increasingly demand efficient resource sharing, as many jobs do not require the full capacity of a GPU. Among sharing techniques, NVIDIA's Multi-Instance GPU (MIG) offers strong resource isolation by enabling…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-12-19 Hsu-Tzu Ting, Jerry Chou, Ming-Hung Chen, I-Hsin Chung

The explosive growth of AI applications has created unprecedented demand for GPU resources. Cloud providers meet this demand through GPU-as-a-Service platforms that offer rentable GPU resources for running AI workloads. In this context, the…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-11-25 Marco Zambianco, Lorenzo Fasol, Roberto Doriguzzi-Corin

Deep learning training is an expensive process that extensively uses GPUs, but not all model training saturates modern powerful GPUs. Multi-Instance GPU (MIG) is a new technology introduced by NVIDIA that can partition a GPU to better-fit…

Machine Learning · Computer Science 2023-04-25 Ties Robroek, Ehsan Yousefzadeh-Asl-Miandoab, Pınar Tözün

GPU clusters in multi-tenant settings often suffer from underutilization, making GPU-sharing technologies essential for efficient resource use. Among them, NVIDIA Multi-Instance GPU (MIG) has gained traction for providing hardware-level…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-11-14 Myeongsu Kim, Ikjun Yeom, Younghoon Kim

The extensive use of GPUs in cloud computing and the growing need for multitenancy have driven the development of innovative solutions for efficient GPU resource management. Multi-Instance GPU (MIG) technology from NVIDIA enables shared GPU…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-02-05 Ahmad Siavashi, Mahmoud Momtazpour

Efficient power management in cloud data centers is essential for reducing costs, enhancing performance, and minimizing environmental impact. GPUs, critical for tasks like machine learning (ML) and GenAI, are major contributors to power…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-05-15 Tirth Vamja, Kaustabha Ray, Felix George, UmaMaheswari C Devi

New architecture GPUs like A100 are now equipped with multi-instance GPU (MIG) technology, which allows the GPU to be partitioned into multiple small, isolated instances. This technology provides more flexibility for users to support both…

Machine Learning · Computer Science 2023-01-03 Huaizheng Zhang, Yuanming Li, Wencong Xiao, Yizheng Huang, Xing Di, Jianxiong Yin, Simon See, Yong Luo, Chiew Tong Lau, Yang You

Advances in GPU compute throughput and memory capacity bring significant opportunities to a wide range of workloads. However, efficiently utilizing these resources remains challenging, particularly because diverse application…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-04-10 Gabin Schieffer, Ruimin Shi, Jie Ren, Ivy Peng

With the rapid advancement of Artificial Intelligence, the Graphics Processing Unit (GPU) has become increasingly essential across a growing number of safety-critical application domains. Applying a GPU is indispensable for parallel…

Operating Systems · Computer Science 2026-02-25 Yuanhai Zhang, Songyang He, Ruizhe Gou, Mingyue Cui, Boyang Li, Shuai Zhao, Kai Huang

To mitigate the increasingly common underutilization of computational resources in modern GPUs, spatial sharing methods enable multiple applications to use them simultaneously. This work presents a comprehensive evaluation of NVIDIA's…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-04-30 Jorge Villarrubia, Luis Costero, Francisco D. Igual, Katzalin Olcoz

GPU technology has been improving at an expedited pace in terms of size and performance, empowering HPC and AI/ML researchers to advance the scientific discovery process. However, this also leads to inefficient resource usage, as most GPU…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-10-10 Baolin Li, Tirthak Patel, Siddarth Samsi, Vijay Gadepally, Devesh Tiwari

In cloud machine learning (ML) inference systems, providing low latency to end-users is of utmost importance. However, maximizing server utilization and system throughput is also crucial for ML service providers as it helps lower the…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-03-01 Yunseong Kim, Yujeong Choi, Minsoo Rhu

NVIDIA's Multi-Instance GPU (MIG) is a feature that enables system designers to reconfigure one large GPU into multiple smaller GPU slices. This work characterizes this emerging GPU and evaluates its effectiveness in designing…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-12-02 Gwangoo Yeo, Jiin Kim, Yujeong Choi, Minsoo Rhu

Scientific workflows are often represented as directed acyclic graphs (DAGs), where vertices correspond to tasks and edges represent the dependencies between them. Since these graphs are often large in both the number of tasks and their…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-07-15 Svetlana Kulagina, Henning Meyerhenke, Anne Benoit

NVIDIA's Multi-Instance GPU (MIG) technology enables partitioning GPU computing power and memory into separate hardware instances, providing complete isolation including compute resources, caches, and memory. However, prior work identifies…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-04-30 Bingyao Li, Yueqi Wang, Tianyu Wang, Lieven Eeckhout, Jun Yang, Aamer Jaleel, Xulong Tang

Parallel machine scheduling has been extensively studied in the past decades, with applications ranging from production planning to job processing in large computing clusters. In this work we study some of these fundamental optimization…

Data Structures and Algorithms · Computer Science 2015-09-08 Yael Mordechai

Many scientific workflows can be represented by a Directed Acyclic Graph (DAG) where each node represents a task, and there will be a directed edge between two tasks if and only if there is a dependency relationship between the two i.e. the…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-12-20 Atharva Tekawade, Suman Banerjee

Many emerging cyber-physical systems, such as autonomous vehicles and robots, rely heavily on artificial intelligence and machine learning algorithms to perform important system operations. Since these highly parallel applications are…

Distributed, Parallel, and Cluster Computing · Computer Science 2023-02-07 An Zou, Jing Li, Christopher D. Gill, Xuan Zhang

Motivated by deep neural network applications, we study the problem of scheduling splittable jobs (e.g., neural network inference tasks) on configurable machines (e.g., multi-instance GPUs). We are given $n$ jobs and a set $C$ of…

Data Structures and Algorithms · Computer Science 2023-12-12 Matthew Casey, Rajmohan Rajaraman, David Stalfa