Machine Learning · Computer Science
Galvatron: Efficient Transformer Training over Multiple GPUs Using Automatic Parallelism
Xupeng Miao, Yujie Wang, Youhe Jiang, Chunan Shi +3
2022-11-28
Machine Learning · Computer Science
SATURN: SAT-based Reinforcement Learning to Unleash LLMs Reasoning
Huanyu Liu, Ge Li, Jia Li, Hao Zhu +2
2026-03-11
Distributed, Parallel, and Cluster Computing · Computer Science
Distributed Training Large-Scale Deep Architectures
Shang-Xuan Zou, Chun-Yen Chen, Jui-Lin Wu, Chun-Nan Chou +5
2017-09-21
Artificial Intelligence · Computer Science
Saturn Platform: Foundation Model Operations and Generative AI for Financial Services
Antonio J. G. Busson, Rennan Gaio, Rafael H. Rocha, Francisco Evangelista +5
2023-12-14
Machine Learning · Computer Science
Large-Scale Deep Learning Optimizations: A Comprehensive Survey
Xiaoxin He, Fuzhao Xue, Xiaozhe Ren, Yang You
2021-11-03
Distributed, Parallel, and Cluster Computing · Computer Science
Arena: Efficiently Training Large Models via Dynamic Scheduling and Adaptive Parallelism Co-Design
Chunyu Xue, Weihao Cui, Quan Chen, Chen Chen +9
2026-03-25
Distributed, Parallel, and Cluster Computing · Computer Science
COMET: A Comprehensive Cluster Design Methodology for Distributed Deep Learning Training
Divya Kiran Kadiyala, Saeed Rashidi, Taekyung Heo, Abhimanyu Rajeshkumar Bambhaniya +2
2024-03-15
Distributed, Parallel, and Cluster Computing · Computer Science
Galvatron: An Automatic Distributed System for Efficient Foundation Model Training
Xinyi Liu, Yujie Wang, Shenhan Zhu, Fangcheng Fu +3
2025-05-01
Distributed, Parallel, and Cluster Computing · Computer Science
Decentralized Training of Foundation Models in Heterogeneous Environments
Binhang Yuan, Yongjun He, Jared Quincy Davis, Tianyi Zhang +5
2023-06-22
Information Retrieval · Computer Science
Deep Learning Model Acceleration and Optimization Strategies for Real-Time Recommendation Systems
Junli Shao, Jing Dong, Dingzhou Wang, Kowei Shih +2
2025-08-14
Machine Learning · Computer Science
Improving Automatic Parallel Training via Balanced Memory Workload Optimization
Yujie Wang, Youhe Jiang, Xupeng Miao, Fangcheng Fu +4
2024-09-06
Distributed, Parallel, and Cluster Computing · Computer Science
DAPPLE: A Pipelined Data Parallel Approach for Training Large Models
Shiqing Fan, Yi Rong, Chen Meng, Zongyan Cao +9
2020-07-03
Distributed, Parallel, and Cluster Computing · Computer Science
The Case for Co-Designing Model Architectures with Hardware
Quentin Anthony, Jacob Hatef, Deepak Narayanan, Stella Biderman +5
2024-02-01
Distributed, Parallel, and Cluster Computing · Computer Science
LoongTrain: Efficient Training of Long-Sequence LLMs with Head-Context Parallelism
Diandian Gu, Peng Sun, Qinghao Hu, Ting Huang +10
2024-06-27
Machine Learning · Computer Science
SONAR: Joint Architecture and System Optimization Search
Elias Jääsaari, Michelle Ma, Ameet Talwalkar, Tianqi Chen
2022-08-26
Distributed, Parallel, and Cluster Computing · Computer Science
VELTAIR: Towards High-Performance Multi-tenant Deep Learning Services via Adaptive Compilation and Scheduling
Zihan Liu, Jingwen Leng, Zhihui Zhang, Quan Chen +2
2022-01-19
Distributed, Parallel, and Cluster Computing · Computer Science
SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient
Max Ryabinin, Tim Dettmers, Michael Diskin, Alexander Borzunov
2023-06-30
Machine Learning · Computer Science
Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training
Shenggui Li, Hongxin Liu, Zhengda Bian, Jiarui Fang +4
2023-10-06
Distributed, Parallel, and Cluster Computing · Computer Science
ATOM: Asynchronous Training of Massive Models for Deep Learning in a Decentralized Environment
Xiaofeng Wu, Jia Rao, Wei Chen
2024-03-18