Related papers: Accelerating Data Loading in Deep Neural Network T…

Importance of Data Loading Pipeline in Training Deep Neural Networks

Training large-scale deep neural networks is a long, time-consuming operation, often requiring many GPUs to accelerate. In large models, the time spent loading data takes a significant portion of model training time. As GPU servers are…

Computer Vision and Pattern Recognition · Computer Science 2020-05-06 Mahdi Zolnouri , Xinlin Li , Vahid Partovi Nia

Distributed Optimization using Heterogeneous Compute Systems

Hardware compute power has been growing at an unprecedented rate in recent years. The utilization of such advancements plays a key role in producing better results in less time -- both in academia and industry. However, merging the existing…

Machine Learning · Computer Science 2021-10-19 Vineeth S

BLoad: Enhancing Neural Network Training with Efficient Sequential Data Handling

The increasing complexity of modern deep neural network models and the expanding sizes of datasets necessitate the development of optimized and scalable training methods. In this white paper, we addressed the challenge of efficiently…

Machine Learning · Computer Science 2024-04-29 Raphael Ruschel , A. S. M. Iftekhar , B. S. Manjunath , Suya You

Distributed learning of deep neural network over multiple agents

In domains such as health care and finance, shortage of labeled data and computational resources is a critical issue while developing machine learning algorithms. To address the issue of labeled data scarcity in training and deployment of…

Machine Learning · Computer Science 2018-10-16 Otkrist Gupta , Ramesh Raskar

Reducing Data Motion to Accelerate the Training of Deep Neural Networks

This paper reduces the cost of DNNs training by decreasing the amount of data movement across heterogeneous architectures composed of several GPUs and multicore CPU devices. In particular, this paper proposes an algorithm to dynamically…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-04-07 Sicong Zhuang , Cristiano Malossi , Marc Casas

Accelerating Neural Network Training with Distributed Asynchronous and Selective Optimization (DASO)

With increasing data and model complexities, the time required to train neural networks has become prohibitively large. To address the exponential rise in training time, users are turning to data parallel neural networks (DPNN) to utilize…

Machine Learning · Computer Science 2022-02-09 Daniel Coquelin , Charlotte Debus , Markus Götz , Fabrice von der Lehr , James Kahn , Martin Siggel , Achim Streit

Data optimization for large batch distributed training of deep neural networks

Distributed training in deep learning (DL) is common practice as data and models grow. The current practice for distributed training of deep neural networks faces the challenges of communication bottlenecks when operating at scale, and…

Machine Learning · Computer Science 2020-12-21 Shubhankar Gahlot , Junqi Yin , Mallikarjun Shankar

Optimizing Distributed Training Approaches for Scaling Neural Networks

This paper presents a comparative analysis of distributed training strategies for large-scale neural networks, focusing on data parallelism, model parallelism, and hybrid approaches. We evaluate these strategies on image classification…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-04-01 Vishnu Vardhan Baligodugula , Fathi Amsaad

An introduction to distributed training of deep neural networks for segmentation tasks with large seismic datasets

Deep learning applications are drastically progressing in seismic processing and interpretation tasks. However, the majority of approaches subsample data volumes and restrict model sizes to minimise computational requirements. Subsampling…

Geophysics · Physics 2021-02-26 Claire Birnie , Haithem Jarraya , Fredrik Hansteen

Accelerating DNN Training in Wireless Federated Edge Learning Systems

Training task in classical machine learning models, such as deep neural networks, is generally implemented at a remote cloud center for centralized learning, which is typically time-consuming and resource-hungry. It also incurs serious…

Machine Learning · Computer Science 2020-10-27 Jinke Ren , Guanding Yu , Guangyao Ding

STANNIS: Low-Power Acceleration of Deep Neural Network Training Using Computational Storage

This paper proposes a framework for distributed, in-storage training of neural networks on clusters of computational storage devices. Such devices not only contain hardware accelerators but also eliminate data movement between the host and…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-10-14 Ali HeydariGorji , Mahdi Torabzadehkashi , Siavash Rezaei , Hossein Bobarshad , Vladimir Alves , Pai H. Chou

Distributed Training of Deep Neural Networks with Theoretical Analysis: Under SSP Setting

We propose a distributed approach to train deep neural networks (DNNs), which has guaranteed convergence theoretically and great scalability empirically: close to 6 times faster on instance of ImageNet data set when run with 6 machines. The…

Machine Learning · Statistics 2016-10-04 Abhimanu Kumar , Pengtao Xie , Junming Yin , Eric P. Xing

Distributed Training Large-Scale Deep Architectures

Scale of data and scale of computation infrastructures together enable the current deep learning renaissance. However, training large-scale deep architectures demands both algorithmic improvement and careful system configuration. In this…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-09-21 Shang-Xuan Zou , Chun-Yen Chen , Jui-Lin Wu , Chun-Nan Chou , Chia-Chin Tsao , Kuan-Chieh Tung , Ting-Wei Lin , Cheng-Lung Sung , Edward Y. Chang

Distributed Deep Learning using Stochastic Gradient Staleness

Despite the notable success of deep neural networks (DNNs) in solving complex tasks, the training process still remains considerable challenges. A primary obstacle is the substantial time required for training, particularly as high…

Machine Learning · Computer Science 2025-09-09 Viet Hoang Pham , Hyo-Sung Ahn

Resource Allocation and Workload Scheduling for Large-Scale Distributed Deep Learning: A Survey

With rapidly increasing distributed deep learning workloads in large-scale data centers, efficient distributed deep learning framework strategies for resource allocation and workload scheduling have become the key to high-performance deep…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-06-13 Feng Liang , Zhen Zhang , Haifeng Lu , Chengming Li , Victor C. M. Leung , Yanyi Guo , Xiping Hu

Large-Scale Stochastic Learning using GPUs

In this work we propose an accelerated stochastic learning system for very large-scale applications. Acceleration is achieved by mapping the training algorithm onto massively parallel processors: we demonstrate a parallel, asynchronous GPU…

Machine Learning · Computer Science 2017-02-24 Thomas Parnell , Celestine Dünner , Kubilay Atasu , Manolis Sifalakis , Haris Pozidis

Big Data Intelligence Using Distributed Deep Neural Networks

Large amount of data is often required to train and deploy useful machine learning models in industry. Smaller enterprises do not have the luxury of accessing enough data for machine learning, For privacy sensitive fields such as banking,…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-09-05 Felix Ongati , Eng. Lawrence Muchemi

Towards Scalable Distributed Training of Deep Learning on Public Cloud Clusters

Distributed training techniques have been widely deployed in large-scale deep neural networks (DNNs) training on dense-GPU clusters. However, on public cloud clusters, due to the moderate inter-connection bandwidth between instances,…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-10-21 Shaohuai Shi , Xianhao Zhou , Shutao Song , Xingyao Wang , Zilin Zhu , Xue Huang , Xinan Jiang , Feihu Zhou , Zhenyu Guo , Liqiang Xie , Rui Lan , Xianbin Ouyang , Yan Zhang , Jieqian Wei , Jing Gong , Weiliang Lin , Ping Gao , Peng Meng , Xiaomin Xu , Chenyang Guo , Bo Yang , Zhibo Chen , Yongjian Wu , Xiaowen Chu

Optimizing Task Placement and Online Scheduling for Distributed GNN Training Acceleration

Training Graph Neural Networks (GNN) on large graphs is resource-intensive and time-consuming, mainly due to the large graph data that cannot be fit into the memory of a single machine, but have to be fetched from distributed graph storage…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-08-23 Ziyue Luo , Yixin Bao , Chuan Wu

Adaptive Data Dropout: Towards Self-Regulated Learning in Deep Neural Networks

Deep neural networks are typically trained by uniformly sampling large datasets across epochs, despite evidence that not all samples contribute equally throughout learning. Recent work shows that progressively reducing the amount of…

Machine Learning · Computer Science 2026-04-15 Amar Gahir , Varshil Patel , Shreyank N Gowda