Related papers: Sparse Binary Compression: Towards Distributed Dee…

Sparse Communication for Training Deep Networks

Synchronous stochastic gradient descent (SGD) is the most common method used for distributed training of deep learning models. In this algorithm, each worker shares its local gradients with others and updates the parameters using the…

Machine Learning · Computer Science 2020-09-22 Negar Foroutan Eghlidi , Martin Jaggi

Robust and Communication-Efficient Federated Learning from Non-IID Data

Federated Learning allows multiple parties to jointly train a deep learning model on their combined data, without any of the participants having to reveal their local data to a centralized server. This form of privacy-preserving…

Machine Learning · Computer Science 2019-03-08 Felix Sattler , Simon Wiedemann , Klaus-Robert Müller , Wojciech Samek

DeepReduce: A Sparse-tensor Communication Framework for Distributed Deep Learning

Sparse tensors appear frequently in distributed deep learning, either as a direct artifact of the deep neural network's gradients, or as a result of an explicit sparsification process. Existing communication primitives are agnostic to the…

Machine Learning · Computer Science 2021-02-08 Kelly Kostopoulou , Hang Xu , Aritra Dutta , Xin Li , Alexandros Ntoulas , Panos Kalnis

Optimal Gradient Compression for Distributed and Federated Learning

Communicating information, like gradient vectors, between computing nodes in distributed and federated learning is typically an unavoidable burden, resulting in scalability issues. Indeed, communication might be slow and costly. Recent…

Machine Learning · Computer Science 2020-10-08 Alyazeed Albasyoni , Mher Safaryan , Laurent Condat , Peter Richtárik

Learned Gradient Compression for Distributed Deep Learning

Training deep neural networks on large datasets containing high-dimensional data requires a large amount of computation. A solution to this problem is data-parallel distributed training, where a model is replicated into several…

Machine Learning · Computer Science 2021-03-18 Lusine Abrahamyan , Yiming Chen , Giannis Bekoulis , Nikos Deligiannis

Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training

Large-scale distributed training requires significant communication bandwidth for gradient exchange that limits the scalability of multi-node training, and requires expensive high-bandwidth network infrastructure. The situation gets even…

Computer Vision and Pattern Recognition · Computer Science 2020-06-24 Yujun Lin , Song Han , Huizi Mao , Yu Wang , William J. Dally

Communication-Efficient Decentralized Learning with Sparsification and Adaptive Peer Selection

Distributed learning techniques such as federated learning have enabled multiple workers to train machine learning models together to reduce the overall training time. However, current distributed training algorithms (centralized or…

Machine Learning · Computer Science 2020-02-25 Zhenheng Tang , Shaohuai Shi , Xiaowen Chu

An Efficient Statistical-based Gradient Compression Technique for Distributed Training Systems

The recent many-fold increase in the size of deep neural networks makes efficient distributed training challenging. Many proposals exploit the compressibility of the gradients and propose lossy compression techniques to speed up the…

Machine Learning · Computer Science 2021-03-19 Ahmed M. Abdelmoniem , Ahmed Elzanaty , Mohamed-Slim Alouini , Marco Canini

S2 Reducer: High-Performance Sparse Communication to Accelerate Distributed Deep Learning

Distributed stochastic gradient descent (SGD) approach has been widely used in large-scale deep learning, and the gradient collective method is vital to ensure the training scalability of the distributed deep learning system. Collective…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-11-29 Keshi Ge , Yongquan Fu , Zhiquan Lai , Xiaoge Deng , Dongsheng Li

Compressed Communication for Distributed Training: Adaptive Methods and System

Communication overhead severely hinders the scalability of distributed machine learning systems. Recently, there has been a growing interest in using gradient compression to reduce the communication overhead of the distributed training.…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-05-19 Yuchen Zhong , Cong Xie , Shuai Zheng , Haibin Lin

PacTrain: Pruning and Adaptive Sparse Gradient Compression for Efficient Collective Communication in Distributed Deep Learning

Large-scale deep neural networks (DNN) exhibit excellent performance for various tasks. As DNNs and datasets grow, distributed training becomes extremely time-consuming and demands larger clusters. A main bottleneck is the resulting…

Distributed, Parallel, and Cluster Computing · Computer Science 2025-05-27 Yisu Wang , Ruilong Wu , Xinjiao Li , Dirk Kutscher

Downlink Compression Improves TopK Sparsification

Training large neural networks is time consuming. To speed up the process, distributed training is often used. One of the largest bottlenecks in distributed training is communicating gradients across different nodes. Different gradient…

Machine Learning · Computer Science 2022-10-03 William Zou , Hans De Sterck , Jun Liu

SuperNeurons: FFT-based Gradient Sparsification in the Distributed Training of Deep Neural Networks

The performance and efficiency of distributed training of Deep Neural Networks highly depend on the performance of gradient averaging among all participating nodes, which is bounded by the communication between nodes. There are two major…

Distributed, Parallel, and Cluster Computing · Computer Science 2021-02-10 Linnan Wang , Wei Wu , Junyu Zhang , Hang Liu , George Bosilca , Maurice Herlihy , Rodrigo Fonseca

CD-SGD: Distributed Stochastic Gradient Descent with Compression and Delay Compensation

Communication overhead is the key challenge for distributed training. Gradient compression is a widely used approach to reduce communication traffic. When combining with parallel communication mechanism method like pipeline, gradient…

Machine Learning · Computer Science 2021-09-08 Enda Yu , Dezun Dong , Yemao Xu , Shuo Ouyang , Xiangke Liao

Distributed SLIDE: Enabling Training Large Neural Networks on Low Bandwidth and Simple CPU-Clusters via Model Parallelism and Sparsity

More than 70% of cloud computing is paid for but sits idle. A large fraction of these idle compute are cheap CPUs with few cores that are not utilized during the less busy hours. This paper aims to enable those CPU cycles to train…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-02-01 Minghao Yan , Nicholas Meisburger , Tharun Medini , Anshumali Shrivastava

ScaleCom: Scalable Sparsified Gradient Compression for Communication-Efficient Distributed Training

Large-scale distributed training of Deep Neural Networks (DNNs) on state-of-the-art platforms is expected to be severely communication constrained. To overcome this limitation, numerous gradient compression techniques have been proposed and…

Machine Learning · Computer Science 2021-04-23 Chia-Yu Chen , Jiamin Ni , Songtao Lu , Xiaodong Cui , Pin-Yu Chen , Xiao Sun , Naigang Wang , Swagath Venkataramani , Vijayalakshmi Srinivasan , Wei Zhang , Kailash Gopalakrishnan

Distributed Sparse Linear Regression under Communication Constraints

In multiple domains, statistical tasks are performed in distributed settings, with data split among several end machines that are connected to a fusion center. In various applications, the end machines have limited bandwidth and power, and…

Machine Learning · Computer Science 2026-01-05 Rodney Fonseca , Boaz Nadler

Variance-based Gradient Compression for Efficient Distributed Deep Learning

Due to the substantial computational cost, training state-of-the-art deep neural networks for large-scale datasets often requires distributed training using multiple computation workers. However, by nature, workers need to frequently…

Machine Learning · Computer Science 2018-02-21 Yusuke Tsuzuku , Hiroto Imachi , Takuya Akiba

Sparsifying Binary Networks

Binary neural networks (BNNs) have demonstrated their ability to solve complex tasks with comparable accuracy as full-precision deep neural networks (DNNs), while also reducing computational power and storage requirements and increasing the…

Machine Learning · Computer Science 2022-07-12 Riccardo Schiavone , Maria A. Zuluaga

Sparse Random Networks for Communication-Efficient Federated Learning

One main challenge in federated learning is the large communication cost of exchanging weight updates from clients to the server at each round. While prior work has made great progress in compressing the weight updates through gradient…

Machine Learning · Computer Science 2023-02-10 Berivan Isik , Francesco Pase , Deniz Gunduz , Tsachy Weissman , Michele Zorzi