GPU Sharing with Triples Mode

Distributed, Parallel, and Cluster Computing 2025-04-08 v1

Abstract

There is tremendous interest in AI/ML technologies due to the proliferation of generative AI applications such as ChatGPT. This trend has significantly increased demand for GPUs, which are the workhorses for training AI models. Because GPUs are expensive and in limited supply, there is growing interest in optimizing GPU usage in HPC centers. The MIT Lincoln Laboratory Supercomputing Center (LLSC) has developed an easy-to-use GPU sharing feature supported by LLSC-developed tools, including LLsub and LLMapReduce. This approach overcomes some of the limitations of existing methods for GPU sharing. It allows users to apply GPU sharing whenever possible while they are developing their AI/ML models, performing parametric studies on their AI models, or executing other GPU applications. Our initial experimental results show that GPU sharing with triples mode is easy to use and achieves significant improvements in GPU usage and throughput for certain types of AI applications.

Cite

@article{arxiv.2410.22254,
  title  = {GPU Sharing with Triples Mode},
  author = {Chansup Byun and Albert Reuther and LaToya Anderson and William Arcand and Bill Bergeron and David Bestor and Alexander Bonn and Daniel Burrill and Vijay Gadepally and Michael Houle and Matthew Hubbell and Hayden Jananthan and Michael Jones and Piotr Luszczek and Peter Michaleas and Lauren Milechin and Guillermo Morales and Julie Mullen and Andrew Prout and Antonio Rosa and Charles Yee and Jeremy Kepner},
  journal= {arXiv preprint arXiv:2410.22254},
  year   = {2025}
}