Related papers: JanusPipe: Efficient Pipeline Parallel Training fo…

200 papers

Pipeline parallelism is a crucial paradigm for large-scale model training. However, imbalances in memory footprint across stages can lead to significant GPU memory wastage, limiting the model sizes that pipeline parallelism can effectively…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-05-12 Xuan Peng, Xuanhua Shi, Haolin Zhang, Yunfei Zhao, Xuehai Qian

Machine learning interatomic potentials (MLIPs) have become powerful tools to extend molecular simulations beyond the limits of quantum methods, offering near-quantum accuracy at much lower computational cost. Yet, developing reliable MLIPs…

Materials Science · Physics 2025-12-30 Adam Lahouari, Jutta Rogal, Mark E. Tuckerman

We present JaxPP, a system for efficiently scaling the training of large deep learning models with flexible pipeline parallelism. We introduce a seamless programming model that allows implementing user-defined pipeline schedules for…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-12-20 Anxhelo Xhebraj, Sean Lee, Hanfeng Chen, Vinod Grover

Pipeline Parallelism (PP) serves as a crucial technique for training Large Language Models (LLMs), owing to its capability to alleviate memory pressure from model states with relatively low communication overhead. However, in long-context…

Machine Learning · Computer Science 2025-04-22 Zhouyang Li, Yuliang Liu, Wei Zhang, Tailing Yuan, Bin Chen, Chengru Song, Di Zhang

With the increasing scale of models, the need for efficient distributed training has become increasingly urgent. Recently, many synchronous pipeline parallelism approaches have been proposed to improve training throughput. However, these…

Machine Learning · Computer Science 2024-10-28 Houming Wu, Ling Chen, Wenjie Yu

The size of deep neural networks (DNNs) grows rapidly as the complexity of the machine learning algorithm increases. To satisfy the requirement of computation and memory of DNN training, distributed deep learning based on model parallelism…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-01-15 Letian Zhao, Rui Xu, Tianqi Wang, Teng Tian, Xiaotian Wang, Wei Wu, Chio-in Ieong, Xi Jin

Larger model sizes and longer sequence lengths have empowered the Large Language Model (LLM) to achieve outstanding performance across various domains. However, this progress brings significant storage capacity challenges for LLM…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-08-26 Xinyuan Lin, Chenlu Li, Zongle Huang, Chunyu Wang, Bo Xiao, Huazhong Yang, Shishi Duan, Yongpan Liu

Machine Learning Interatomic Potentials (MLIP) are a novel in silico approach for molecular property prediction, creating an alternative to disrupt the accuracy/speed trade-off of empirical force fields and density functional theory (DFT).…

The need to use a short time step is a key limit on the speed of molecular dynamics (MD) simulations. Simulations governed by classical potentials are often accelerated by using a multiple-time-step (MTS) integrator that evaluates certain…

Chemical Physics · Physics 2023-10-24 Xiang Fu, Albert Musaelian, Anders Johansson, Tommi Jaakkola, Boris Kozinsky

Pipeline parallelism is widely used to train large language models (LLMs). However, increasing heterogeneity in model architectures exacerbates pipeline bubbles, thereby reducing training efficiency. Existing approaches overlook the…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-09-30 Jihu Guo, Tenghui Ma, Wei Gao, Peng Sun, Jiaxing Li, Xun Chen, Yuyang Jin, Dahua Lin

Machine learning interatomic potentials (MLIPs) enable efficient molecular dynamics (MD) simulations with ab initio accuracy and have been applied across various domains in physical science. However, their performance often relies on…

Computational Physics · Physics 2025-07-29 Taoyong Cui, Zhongyao Wang, Dongzhan Zhou, Yuqiang Li, Lei Bai, Wanli Ouyang, Mao Su, Shufei Zhang

Machine learned interatomic potentials (MLIPs) have emerged as powerful tools for molecular dynamics (MD) simulations with their competitive accuracy and computational efficiency. However, MLIPs are often observed to exhibit un-physical…

Materials Science · Physics 2026-02-24 Qianyu Zheng, Victor Fung

Large-scale atomistic simulations are essential to bridge computational materials and chemistry to realistic materials and drug discovery applications. In the past few years, rapid developments of machine learning interatomic potentials…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-02-03 Kevin Han, Bowen Deng, Amir Barati Farimani, Gerbrand Ceder

It is a challenging task to train large DNN models on sophisticated GPU platforms with diversified interconnect capabilities. Recently, pipelined training has been proposed as an effective approach for improving device utilization. However,…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-07-03 Shiqing Fan, Yi Rong, Chen Meng, Zongyan Cao, Siyu Wang, Zhen Zheng, Chuan Wu, Guoping Long, Jun Yang, Lixue Xia, Lansong Diao, Xiaoyong Liu, Wei Lin

Machine learning interatomic potentials (MLIPs) enable molecular dynamics (MD) simulations with ab initio accuracy and have been applied to various fields of physical science. However, the performance and transferability of MLIPs are…

Chemical Physics · Physics 2024-04-16 Taoyong Cui, Chenyu Tang, Mao Su, Shufei Zhang, Yuqiang Li, Lei Bai, Yuhan Dong, Xingao Gong, Wanli Ouyang

Model parallelism has become a necessity for training modern large-scale deep language models. In this work, we identify a new and orthogonal dimension from existing model parallel approaches: it is possible to perform pipeline parallelism…

Machine Learning · Computer Science 2021-09-29 Zhuohan Li, Siyuan Zhuang, Shiyuan Guo, Danyang Zhuo, Hao Zhang, Dawn Song, Ion Stoica

The increasing demand for intelligent mobile applications has made multi-agent collaboration with Transformer-based large language models (LLMs) essential in mobile edge computing (MEC) networks. However, training LLMs in such environments…

Systems and Control · Electrical Eng. & Systems 2025-09-25 Jiewei Chen, Xiumei Deng, Zehui Xiong, Shaoyong Guo, Xuesong Qiu, Ping Wang, Dusit Niyato

Universal Machine Learning Interatomic Potentials (uMLIPs), pre-trained on massively diverse datasets encompassing inorganic materials and organic molecules across the entire periodic table, serve as foundational models for quantum-accurate…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-04-20 Yuanchang Zhou, Hongyu Wang, Yiming Du, Yan Wang, Mingzhen Li, Siyu Hu, Xiangyu Zhang, Weijian Liu, Chen Wang, Zhuoqiang Guo, Long Wang, Jingde Bu, Yutong Lu, Guangming Tan, Weile Jia

Machine learning interatomic potentials (MLIPs) enable more efficient molecular dynamics (MD) simulations with ab initio accuracy, which have been used in various domains of physical science. However, distribution shift between training and…

Computational Physics · Physics 2024-05-15 Taoyong Cui, Chenyu Tang, Dongzhan Zhou, Yuqiang Li, Xingao Gong, Wanli Ouyang, Mao Su, Shufei Zhang

As inference workloads for large language models (LLMs) scale to meet growing user demand, pipeline parallelism (PP) has become a widely adopted strategy for multi-GPU deployment, particularly in cross-node setups, to improve key-value (KV)…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-06-30 Yongchao He, Bohan Zhao, Zheng Cao