Related papers: Optimal Gradient Quantization Condition for Commun…

200 papers

Due to the substantial computational cost, training state-of-the-art deep neural networks for large-scale datasets often requires distributed training using multiple computation workers. However, by nature, workers need to frequently…

Machine Learning · Computer Science 2018-02-21 Yusuke Tsuzuku, Hiroto Imachi, Takuya Akiba

Gradient quantization is an emerging technique in reducing communication costs in distributed learning. Existing gradient quantization algorithms often rely on engineering heuristics or empirical observations, lacking a systematic approach…

Machine Learning · Computer Science 2021-08-02 Guangfeng Yan, Shao-Lun Huang, Tian Lan, Linqi Song

We consider machine learning applications that train a model by leveraging data distributed over a trusted network, where communication constraints can create a performance bottleneck. A number of recent approaches propose to overcome this…

Machine Learning · Computer Science 2021-09-10 Osama A. Hanna, Yahya H. Ezzeldin, Christina Fragouli, Suhas Diggavi

One of the most significant bottlenecks in training large-scale machine learning models on a parameter server (PS) is the communication overhead, because model gradients must be exchanged frequently between the workers and servers…

Machine Learning · Computer Science 2018-04-25 Guoxin Cui, Jun Xu, Wei Zeng, Yanyan Lan, Jiafeng Guo, Xueqi Cheng

We consider the problem of solving a distributed optimization problem using a distributed computing platform, where the communication in the network is limited: each node can only communicate with its neighbours and the channel has a…

Systems and Control · Computer Science 2015-04-10 Ye Pu, Melanie N. Zeilinger, Colin N. Jones

We study distributed optimization problems over a network when the communication between the nodes is constrained, and so information that is exchanged between the nodes must be quantized. Recent advances using the distributed gradient…

Optimization and Control · Mathematics 2019-05-14 Thinh T. Doan, Siva Theja Maguluri, Justin Romberg

Distributed optimization increasingly plays a central role in economical and sustainable operation of cyber-physical systems. Nevertheless, the complete potential of the technology has not yet been fully exploited in practice due to…

Optimization and Control · Mathematics 2017-10-24 Sindri Magnusson, Chinwendu Enyioha, Na Li, Carlo Fischione, Vahid Tarokh

Massive amounts of data have made training large-scale machine learning models on a single worker inefficient. Distributed machine learning methods such as Parallel-SGD have received significant interest as a solution to tackle…

Machine Learning · Computer Science 2022-03-31 S Vineeth

Quantization reduces computation costs of neural networks but suffers from performance degeneration. Is this accuracy drop due to the reduced capacity, or inefficient training during the quantization procedure? After looking into the…

Computer Vision and Pattern Recognition · Computer Science 2019-12-24 Qing Jin, Linjie Yang, Zhenyu Liao

Large-scale distributed optimization is of great importance in various applications. For data-parallel based distributed learning, the inter-node gradient communication often becomes the performance bottleneck. In this paper, we propose the…

Computer Vision and Pattern Recognition · Computer Science 2018-06-22 Jiaxiang Wu, Weidong Huang, Junzhou Huang, Tong Zhang

Training neural networks requires significant computational resources and energy. Methods like mixed-precision and quantization-aware training reduce bit usage, yet they still depend heavily on computationally expensive gradient-based…

Machine Learning · Computer Science 2025-09-30 Noa Cohen, Omkar Joglekar, Dotan Di Castro, Vladimir Tchuiev, Shir Kozlovsky, Michal Moshkovitz

Parallel implementations of stochastic gradient descent (SGD) have received significant research attention, thanks to excellent scalability properties of this algorithm, and to its efficiency in the context of training deep neural networks.…

Machine Learning · Computer Science 2017-12-07 Dan Alistarh, Demjan Grubic, Jerry Li, Ryota Tomioka, Milan Vojnovic

Quantized neural networks (QNNs) are among the main approaches for deploying deep neural networks on low resource edge devices. Training QNNs using different levels of precision throughout the network (dynamic quantization) typically…

Machine Learning · Computer Science 2021-02-19 Benjamin J. Bodner, Gil Ben Shalom, Eran Treister

In recent years, distributed optimization has proven to be an effective approach to accelerating the training of large-scale machine learning models such as deep neural networks. With the increasing computation power of GPUs, the bottleneck of…

Machine Learning · Computer Science 2021-09-14 Xiangyi Chen, Xiaoyun Li, Ping Li

Communication overhead is the key challenge for distributed training. Gradient compression is a widely used approach to reducing communication traffic. When combined with a parallel communication mechanism such as pipelining, gradient…

Machine Learning · Computer Science 2021-09-08 Enda Yu, Dezun Dong, Yemao Xu, Shuo Ouyang, Xiangke Liao

We study distributed optimization problems over a network when the communication between the nodes is constrained, and so information that is exchanged between the nodes must be quantized. This imperfect communication poses a fundamental…

Optimization and Control · Mathematics 2018-10-30 Thinh T. Doan, Siva Theja Maguluri, Justin Romberg

Modern distributed training of machine learning models suffers from high communication overhead for synchronizing stochastic gradients and model parameters. In this paper, to reduce the communication complexity, we propose double…

Optimization and Control · Mathematics 2019-05-28 Yue Yu, Jiaxiang Wu, Longbo Huang

Communication of model updates between client nodes and the central aggregating server is a major bottleneck in federated learning, especially in bandwidth-limited settings and high-dimensional models. Gradient quantization is an effective…

Machine Learning · Computer Science 2021-02-10 Divyansh Jhunjhunwala, Advait Gadhikar, Gauri Joshi, Yonina C. Eldar

Modern large scale machine learning applications require stochastic optimization algorithms to be implemented on distributed computational architectures. A key bottleneck is the communication overhead for exchanging information such as…

Machine Learning · Computer Science 2017-10-31 Jianqiao Wangni, Jialei Wang, Ji Liu, Tong Zhang

Synchronous stochastic gradient descent (SGD) is the most common method used for distributed training of deep learning models. In this algorithm, each worker shares its local gradients with others and updates the parameters using the…

Machine Learning · Computer Science 2020-09-22 Negar Foroutan Eghlidi, Martin Jaggi