English
Related papers

Related papers: Priority-based Parameter Propagation for Distribut…

200 papers

To train modern large DNN models, pipeline parallelism has recently emerged, which distributes the model across GPUs and enables different devices to process different microbatches in pipeline. Earlier pipeline designs allow multiple…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-08-23 Ziyue Luo , Xiaodong Yi , Guoping Long , Shiqing Fan , Chuan Wu , Jun Yang , Wei Lin

Distributed training of deep neural networks has received significant research interest, and its major approaches include implementations on multiple GPUs and clusters. Parallelization can dramatically improve the efficiency of training…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-11-29 Jaehee Jang , Byungook Na , Sungroh Yoon

Distributed deep neural network (DDNN) training constitutes an increasingly important workload that frequently runs in the cloud. Larger DNN models and faster compute engines are shifting DDNN training bottlenecks from computation to…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-01-22 Liang Luo , Jacob Nelson , Luis Ceze , Amar Phanishayee , Arvind Krishnamurthy

Deep neural networks (DNN) have achieved remarkable success in various fields, including computer vision and natural language processing. However, training an effective DNN model still poses challenges. This paper aims to propose a method…

Machine Learning · Computer Science 2024-07-03 Hejie Ying , Mengmeng Song , Yaohong Tang , Shungen Xiao , Zimin Xiao

Training a deep neural network (DNN) requires substantial computational and memory requirements. It is common to use multiple devices to train a DNN to reduce the overall training time. There are several choices to parallelize each layer in…

Machine Learning · Computer Science 2024-07-08 Venmugil Elango

Parameter updating is an important stage in parallelism-based distributed deep learning. Synchronous methods are widely used in distributed training the Deep Neural Networks (DNNs). To reduce the communication and synchronization overhead…

Machine Learning · Computer Science 2020-09-09 Qing Ye , Yuxuan Han , Yanan sun , JIancheng Lv

It is a challenging task to train large DNN models on sophisticated GPU platforms with diversified interconnect capabilities. Recently, pipelined training has been proposed as an effective approach for improving device utilization. However,…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-07-03 Shiqing Fan , Yi Rong , Chen Meng , Zongyan Cao , Siyu Wang , Zhen Zheng , Chuan Wu , Guoping Long , Jun Yang , Lixue Xia , Lansong Diao , Xiaoyong Liu , Wei Lin

This paper presents a comparative analysis of distributed training strategies for large-scale neural networks, focusing on data parallelism, model parallelism, and hybrid approaches. We evaluate these strategies on image classification…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-04-01 Vishnu Vardhan Baligodugula , Fathi Amsaad

Data parallelism has emerged as a necessary technique to accelerate the training of deep neural networks (DNN). In a typical data parallelism approach, the local workers push the latest updates of all the parameters to the parameter server…

Distributed, Parallel, and Cluster Computing · Computer Science 2017-09-28 Shizhao Sun , Wei Chen , Jiang Bian , Xiaoguang Liu , Tie-Yan Liu

We propose a new integrated method of exploiting model, batch and domain parallelism for the training of deep neural networks (DNNs) on large distributed-memory computers using minibatch stochastic gradient descent (SGD). Our goal is to…

Machine Learning · Computer Science 2018-05-17 Amir Gholami , Ariful Azad , Peter Jin , Kurt Keutzer , Aydin Buluc

The past few years have witnessed the flourishing of large-scale deep neural network models with ever-growing parameter numbers. Training such large-scale models typically requires massive memory and computing resources, necessitating…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-08-30 Yunze Wei , Tianshuo Hu , Cong Liang , Yong Cui

Recently, several optimization methods have been successfully applied to the hyperparameter optimization of deep neural networks (DNNs). The methods work by modeling the joint distribution of hyperparameter values and corresponding error.…

Machine Learning · Computer Science 2016-08-02 Ilija Ilievski , Jiashi Feng

Despite the notable success of deep neural networks (DNNs) in solving complex tasks, the training process still remains considerable challenges. A primary obstacle is the substantial time required for training, particularly as high…

Machine Learning · Computer Science 2025-09-09 Viet Hoang Pham , Hyo-Sung Ahn

Parallel training of neural networks at scale is challenging due to significant overheads arising from communication. Recently, deep learning researchers have developed a variety of pruning algorithms that are capable of pruning (i.e.…

Machine Learning · Computer Science 2023-05-16 Siddharth Singh , Abhinav Bhatele

Most work in the deep learning systems community has focused on faster inference, but arriving at a trained model requires lengthy experiments. Accelerating training lets developers iterate faster and come up with better models. DNN…

Distributed, Parallel, and Cluster Computing · Computer Science 2020-01-22 Liang Luo , Jacob Nelson , Luis Ceze , Amar Phanishayee , Arvind Krishnamurthy

Training Graph Neural Networks (GNN) on large graphs is resource-intensive and time-consuming, mainly due to the large graph data that cannot be fit into the memory of a single machine, but have to be fetched from distributed graph storage…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-08-23 Ziyue Luo , Yixin Bao , Chuan Wu

With increasing data and model complexities, the time required to train neural networks has become prohibitively large. To address the exponential rise in training time, users are turning to data parallel neural networks (DPNN) to utilize…

Machine Learning · Computer Science 2022-02-09 Daniel Coquelin , Charlotte Debus , Markus Götz , Fabrice von der Lehr , James Kahn , Martin Siggel , Achim Streit

The Convolutional Neural Network (CNN) model, often used for image classification, requires significant training time to obtain high accuracy. To this end, distributed training is performed with the parameter server (PS) architecture using…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-01-18 Jay H. Park , Sunghwan Kim , Jinwon Lee , Myeongjae Jeon , Sam H. Noh

The training process of Deep Neural Network (DNN) is compute-intensive, often taking days to weeks to train a DNN model. Therefore, parallel execution of DNN training on GPUs is a widely adopted approach to speed up the process nowadays.…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-10-29 Chi-Chung Chen , Chia-Lin Yang , Hsiang-Yun Cheng

When fine-tuning Deep Neural Networks (DNNs) to new data, DNNs are prone to overwriting network parameters required for task-specific functionality on previously learned tasks, resulting in a loss of performance on those tasks. We propose…

Machine Learning · Computer Science 2025-01-22 Christopher Angelini , Nidhal Bouaynaya
‹ Prev 1 2 3 10 Next ›