Related papers: Quantizing data for distributed learning

200 papers

Due to the substantial computational cost, training state-of-the-art deep neural networks for large-scale datasets often requires distributed training using multiple computation workers. However, by nature, workers need to frequently…

Machine Learning · Computer Science 2018-02-21 Yusuke Tsuzuku, Hiroto Imachi, Takuya Akiba

Training large machine learning models requires a distributed computing approach, with communication of the model updates being the bottleneck. For this reason, several methods based on the compression (e.g., sparsification and/or…

Machine Learning · Computer Science 2023-12-29 Konstantin Mishchenko, Eduard Gorbunov, Martin Takáč, Peter Richtárik

Large-scale distributed optimization is of great importance in various applications. For data-parallel based distributed learning, the inter-node gradient communication often becomes the performance bottleneck. In this paper, we propose the…

Computer Vision and Pattern Recognition · Computer Science 2018-06-22 Jiaxiang Wu, Weidong Huang, Junzhou Huang, Tong Zhang

We study distributed optimization problems over a network when the communication between the nodes is constrained, and so information that is exchanged between the nodes must be quantized. Recent advances using the distributed gradient…

Optimization and Control · Mathematics 2019-05-14 Thinh T. Doan, Siva Theja Maguluri, Justin Romberg

Massive amounts of data have made training large-scale machine learning models on a single worker inefficient. Distributed machine learning methods such as Parallel-SGD have received significant interest as a solution to tackle…

Machine Learning · Computer Science 2022-03-31 S Vineeth

Communication overhead severely hinders the scalability of distributed machine learning systems. Recently, there has been a growing interest in using gradient compression to reduce the communication overhead of the distributed training.…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-05-19 Yuchen Zhong, Cong Xie, Shuai Zheng, Haibin Lin

One of the most significant bottlenecks in training large-scale machine learning models on a parameter server (PS) is the communication overhead, because the model gradients must be frequently exchanged between the workers and servers…

Machine Learning · Computer Science 2018-04-25 Guoxin Cui, Jun Xu, Wei Zeng, Yanyan Lan, Jiafeng Guo, Xueqi Cheng

The present paper develops a novel aggregated gradient approach for distributed machine learning that adaptively compresses the gradient communication. The key idea is to first quantize the computed gradients, and then skip less informative…

Machine Learning · Computer Science 2019-09-18 Jun Sun, Tianyi Chen, Georgios B. Giannakis, Zaiyue Yang

We study distributed optimization problems over a network when the communication between the nodes is constrained, and so information that is exchanged between the nodes must be quantized. This imperfect communication poses a fundamental…

Optimization and Control · Mathematics 2018-10-30 Thinh T. Doan, Siva Theja Maguluri, Justin Romberg

The communication of gradients is costly for training deep neural networks with multiple devices in computer vision applications. In particular, the growing size of deep learning models leads to higher communication overheads that defy the…

Machine Learning · Computer Science 2020-02-26 An Xu, Zhouyuan Huo, Heng Huang

We consider the problem of solving a distributed optimization problem using a distributed computing platform, where the communication in the network is limited: each node can only communicate with its neighbours and the channel has a…

Systems and Control · Computer Science 2015-04-10 Ye Pu, Melanie N. Zeilinger, Colin N. Jones

Parallel implementations of stochastic gradient descent (SGD) have received significant research attention, thanks to excellent scalability properties of this algorithm, and to its efficiency in the context of training deep neural networks.…

Machine Learning · Computer Science 2017-12-07 Dan Alistarh, Demjan Grubic, Jerry Li, Ryota Tomioka, Milan Vojnovic

Enabling low precision implementations of deep learning models, without considerable performance degradation, is necessary in resource and latency constrained settings. Moreover, exploiting the differences in sensitivity to quantization…

Machine Learning · Computer Science 2022-10-28 Ignacio Hounie, Juan Elenter, Alejandro Ribeiro

Gradient quantization is an emerging technique in reducing communication costs in distributed learning. Existing gradient quantization algorithms often rely on engineering heuristics or empirical observations, lacking a systematic approach…

Machine Learning · Computer Science 2021-08-02 Guangfeng Yan, Shao-Lun Huang, Tian Lan, Linqi Song

We consider distributed optimization where the objective function is spread among different devices, each sending incremental model updates to a central server. To alleviate the communication bottleneck, recent work proposed various schemes…

Optimization and Control · Mathematics 2019-04-11 Samuel Horváth, Dmitry Kovalev, Konstantin Mishchenko, Sebastian Stich, Peter Richtárik

Training deep neural networks on large datasets containing high-dimensional data requires a large amount of computation. A solution to this problem is data-parallel distributed training, where a model is replicated into several…

Machine Learning · Computer Science 2021-03-18 Lusine Abrahamyan, Yiming Chen, Giannis Bekoulis, Nikos Deligiannis

Data explosion and an increase in model size drive the remarkable advances in large-scale machine learning, but also make model training time-consuming and model storage difficult. To address the above issues in the distributed model…

Machine Learning · Computer Science 2022-08-12 Ke Xu, Jianqiao Wangni, Yifan Zhang, Deheng Ye, Jiaxiang Wu, Peilin Zhao

Due to limited communication resources at the client and a massive number of model parameters, large-scale distributed learning tasks suffer from a communication bottleneck. Gradient compression is an effective method to reduce communication…

Machine Learning · Computer Science 2021-11-17 Kai Liang, Huiru Zhong, Haoning Chen, Youlong Wu

A popular network compression approach is Quantization-Aware Training (QAT), which accelerates the forward pass during neural network training and inference. However, little prior effort has been made to quantize and…

Computer Vision and Pattern Recognition · Computer Science 2023-11-02 Kaixin Xu, Alina Hui Xiu Lee, Ziyuan Zhao, Zhe Wang, Min Wu, Weisi Lin

Gradient compression has surfaced as a key technique to address the challenge of communication efficiency in distributed learning. In distributed deep learning, however, it is observed that gradient distributions are heavy-tailed, with…

Machine Learning · Computer Science 2024-02-07 Guangfeng Yan, Tan Li, Yuanzhang Xiao, Hanxu Hou, Linqi Song