Distributed, Parallel, and Cluster Computing · Computer Science
HTC Scientific Computing in a Distributed Cloud Environment
R. Sobie, A. Agarwal, I. Gable, C. Leavett-Brown +5
2013-02-11
Performance · Computer Science
Speeding up Deep Learning with Transient Servers
Shijian Li, Robert J. Walls, Lijie Xu, Tian Guo
2019-05-07
Distributed, Parallel, and Cluster Computing · Computer Science
Towards Scalable Distributed Training of Deep Learning on Public Cloud Clusters
Shaohuai Shi, Xianhao Zhou, Shutao Song, Xingyao Wang +20
2020-10-21
Distributed, Parallel, and Cluster Computing · Computer Science
Scalability Evaluation of HPC Multi-GPU Training for ECG-based LLMs
Dimitar Mileski, Nikola Petrovski, Marjan Gusev
2025-03-28
Distributed, Parallel, and Cluster Computing · Computer Science
Federated Learning Framework for Scalable AI in Heterogeneous HPC and Cloud Environments
Sangam Ghimire, Paribartan Timalsina, Nirjal Bhurtel, Bishal Neupane +5
2025-11-26
Distributed, Parallel, and Cluster Computing · Computer Science
Characterizing and Modeling Distributed Training with Transient Cloud GPU Servers
Shijian Li, Robert J. Walls, Tian Guo
2020-04-08
Distributed, Parallel, and Cluster Computing · Computer Science
A Container-Based Workflow for Distributed Training of Deep Learning Algorithms in HPC Clusters
Jose González-Abad, Álvaro López García, Valentin Y. Kozlov
2022-11-15
Machine Learning · Computer Science
A Privacy-Preserving Cloud Architecture for Distributed Machine Learning at Scale
Vinoth Punniyamoorthy, Ashok Gadi Parthi, Mayilsamy Palanigounder, Ravi Kiran Kodali +2
2025-12-13
Distributed, Parallel, and Cluster Computing · Computer Science
Deep Learning on Operational Facility Data Related to Large-Scale Distributed Area Scientific Workflows
Alok Singh, Eric Stephan, Malachi Schram, Ilkay Altintas
2018-04-24
Machine Learning · Statistics
Incremental Learning Framework Using Cloud Computing
Kumarjit Pathak, Prabhukiran G, Jitin Kapila, Nikit Gawande
2018-05-15
Distributed, Parallel, and Cluster Computing · Computer Science
ChainerMN: Scalable Distributed Deep Learning Framework
Takuya Akiba, Keisuke Fukuda, Shuji Suzuki
2017-11-01
Machine Learning · Statistics
A Data and Model-Parallel, Distributed and Scalable Framework for Training of Deep Networks in Apache Spark
Disha Shrivastava, Santanu Chaudhury, Dr. Jayadeva
2017-08-22
Computer Vision and Pattern Recognition · Computer Science
Decentralized Diffusion Models
David McAllister, Matthew Tancik, Jiaming Song, Angjoo Kanazawa
2025-01-13
Machine Learning · Computer Science
A Survey and Empirical Evaluation of Parallel Deep Learning Frameworks
Daniel Nichols, Siddharth Singh, Shu-Huai Lin, Abhinav Bhatele
2022-07-04
Machine Learning · Computer Science
Scalable Cross-Facility Federated Learning for Scientific Foundation Models on Multiple Supercomputers
Yijiang Li, Zilinghan Li, Kyle Chard, Ian Foster +3
2026-03-23
Machine Learning · Computer Science
GraphLab: A Distributed Framework for Machine Learning in the Cloud
Yucheng Low, Joseph Gonzalez, Aapo Kyrola, Danny Bickson +1
2011-07-06
Distributed, Parallel, and Cluster Computing · Computer Science
Performance Modeling and Evaluation of Distributed Deep Learning Frameworks on GPUs
Shaohuai Shi, Qiang Wang, Xiaowen Chu
2018-08-21
Distributed, Parallel, and Cluster Computing · Computer Science
Performance of Distributed File Systems on Cloud Computing Environment: An Evaluation for Small-File Problem
Thanh Duong, Quoc Luu, Hung Nguyen
2024-01-01
Distributed, Parallel, and Cluster Computing · Computer Science
HETHUB: A Distributed Training System with Heterogeneous Cluster for Large-Scale Models
Si Xu, Zixiao Huang, Yan Zeng, Shengen Yan +10
2024-08-12
Distributed, Parallel, and Cluster Computing · Computer Science
The Case for Strong Scaling in Deep Learning: Training Large 3D CNNs with Hybrid Parallelism
Yosuke Oyama, Naoya Maruyama, Nikoli Dryden, Erin McCarthy +5
2020-07-28
Distributed, Parallel, and Cluster Computing · Computer Science
Computron: Serving Distributed Deep Learning Models with Model Parallel Swapping
Daniel Zou, Xinchen Jin, Xueyang Yu, Hao Zhang +1
2023-06-27