Machine Learning · Computer Science
Automatic Operator-level Parallelism Planning for Distributed Deep Learning -- A Mixed-Integer Programming Approach
Ruifeng She, Bowen Pang, Kai Li, Zehua Liu +1
2025-03-13
Machine Learning · Computer Science
CHOPT: Automated Hyperparameter Optimization Framework for Cloud-Based Machine Learning Platforms
Jinwoong Kim, Minkyu Kim, Heungseok Park, Ernar Kusdavletov +5
2018-10-17
Distributed, Parallel, and Cluster Computing · Computer Science
Using Meta-heuristics and Machine Learning for Software Optimization of Parallel Computing Systems: A Systematic Literature Review
Suejb Memeti, Sabri Pllana, Alecio Binotto, Joanna Kolodziej +1
2018-05-03
Distributed, Parallel, and Cluster Computing · Computer Science
Collaborative Cluster Configuration for Distributed Data-Parallel Processing: A Research Overview
Lauritz Thamsen, Dominik Scheinert, Jonathan Will, Jonathan Bader +1
2022-06-02
Machine Learning · Computer Science
Orchestra: Unsupervised Federated Learning via Globally Consistent Clustering
Ekdeep Singh Lubana, Chi Ian Tang, Fahim Kawsar, Robert P. Dick +1
2022-06-14
Distributed, Parallel, and Cluster Computing · Computer Science
Online Job Scheduling in Distributed Machine Learning Clusters
Yixin Bao, Yanghua Peng, Chuan Wu, Zongpeng Li
2018-01-04
Machine Learning · Computer Science
Exploiting Parallelism Opportunities with Deep Learning Frameworks
Yu Emma Wang, Carole-Jean Wu, Xiaodong Wang, Kim Hazelwood +1
2020-07-01
Machine Learning · Computer Science
Enhancing Multi-Objective Optimization through Machine Learning-Supported Multiphysics Simulation
Diego Botache, Jens Decke, Winfried Ripken, Abhinay Dornipati +3
2024-04-04
Machine Learning · Computer Science
Assisted Learning for Organizations with Limited Imbalanced Data
Cheng Chen, Jiaying Zhou, Jie Ding, Yi Zhou
2024-03-05
Distributed, Parallel, and Cluster Computing · Computer Science
Bi-objective Optimisation of Data-parallel Applications on Heterogeneous Platforms for Performance and Energy via Workload Distribution
Hamidreza Khaleghzadeh, Muhammad Fahad, Arsalan Shahid, Ravi Reddy Manumachu +1
2019-07-10
Distributed, Parallel, and Cluster Computing · Computer Science
HARP: Orchestrating Automated Parallel Training on Heterogeneous GPU Clusters
Antian Liang, Zhigang Zhao, Kai Zhang, Xuri Shi +5
2026-05-05
Machine Learning · Computer Science
ShadowSync: Performing Synchronization in the Background for Highly Scalable Distributed Training
Qinqing Zheng, Bor-Yiing Su, Jiyan Yang, Alisson Azzolini +6
2021-02-24
Machine Learning · Computer Science
Machine Learning-based Orchestration of Containers: A Taxonomy and Future Directions
Zhiheng Zhong, Minxian Xu, Maria Alejandra Rodriguez, Chengzhong Xu +1
2021-08-23
Distributed, Parallel, and Cluster Computing · Computer Science
Workflow-Driven Modeling for the Compute Continuum: An Optimization Approach to Automated System and Workload Scheduling
Aasish Kumar Sharma, Christian Boehme, Patrick Gelß, Ramin Yahyapour +1
2025-05-20
Distributed, Parallel, and Cluster Computing · Computer Science
Scaling Studies for Efficient Parameter Search and Parallelism for Large Language Model Pre-training
Michael Benington, Leo Phan, Chris Pierre Paul, Evan Shoemaker +4
2023-10-12
Distributed, Parallel, and Cluster Computing · Computer Science
Distributed Training Large-Scale Deep Architectures
Shang-Xuan Zou, Chun-Yen Chen, Jui-Lin Wu, Chun-Nan Chou +5
2017-09-21
Distributed, Parallel, and Cluster Computing · Computer Science
Service Orchestration in the Computing Continuum: Structural Challenges and Vision
Boris Sedlak, Víctor Casamayor Pujol, Ildefons Magrans de Abril, Praveen Kumar Donta +2
2026-02-18
Machine Learning · Computer Science
A Survey and Empirical Evaluation of Parallel Deep Learning Frameworks
Daniel Nichols, Siddharth Singh, Shu-Huai Lin, Abhinav Bhatele
2022-07-04
Machine Learning · Computer Science
Hybrid Learning for Orchestrating Deep Learning Inference in Multi-user Edge-cloud Networks
Sina Shahhosseini, Tianyi Hu, Dongjoo Seo, Anil Kanduri +3
2022-02-24
Distributed, Parallel, and Cluster Computing · Computer Science
Toward Efficient Online Scheduling for Distributed Machine Learning Systems
Menglu Yu, Jia Liu, Chuan Wu, Bo Ji +1
2022-05-16
Distributed, Parallel, and Cluster Computing · Computer Science
Efficient Pipeline Planning for Expedited Distributed DNN Training
Ziyue Luo, Xiaodong Yi, Guoping Long, Shiqing Fan +3
2022-08-23