Related papers: Compression for Distributed Optimization and Timel…

Distributed learning with compressed gradients

Asynchronous computation and gradient compression have emerged as two key techniques for achieving scalability in distributed optimization for large-scale machine learning. This paper presents a unified analysis framework for distributed…

Optimization and Control · Mathematics 2018-11-30 Sarit Khirirat, Hamid Reza Feyzmahdavian, Mikael Johansson

Distributed Delayed Stochastic Optimization

We analyze the convergence of gradient-based optimization algorithms that base their updates on delayed stochastic gradient information. The main application of our results is to the development of gradient-based distributed optimization…

Optimization and Control · Mathematics 2011-05-02 Alekh Agarwal, John C. Duchi

Hybrid Decentralized Optimization: Leveraging Both First- and Zeroth-Order Optimizers for Faster Convergence

Distributed optimization is the standard way of speeding up machine learning training, and most of the research in the area focuses on distributed first-order, gradient-based methods. Yet, there are settings where some…

Machine Learning · Computer Science 2025-11-03 Matin Ansaripour, Shayan Talaei, Giorgi Nadiradze, Dan Alistarh

Real Acceleration of Communication Process in Distributed Algorithms with Compression

Modern applied optimization problems become more and more complex every day. Due to this fact, distributed algorithms that can speed up the process of solving an optimization problem through parallelization are of great importance. The main…

Optimization and Control · Mathematics 2023-12-14 Svetlana Tkachenko, Artem Andreev, Aleksandr Beznosikov, Alexander Gasnikov

Quantization Design for Distributed Optimization

We consider the problem of solving a distributed optimization problem using a distributed computing platform, where the communication in the network is limited: each node can only communicate with its neighbours and the channel has a…

Systems and Control · Computer Science 2015-04-10 Ye Pu, Melanie N. Zeilinger, Colin N. Jones

Variance-based Gradient Compression for Efficient Distributed Deep Learning

Due to the substantial computational cost, training state-of-the-art deep neural networks for large-scale datasets often requires distributed training using multiple computation workers. However, by nature, workers need to frequently…

Machine Learning · Computer Science 2018-02-21 Yusuke Tsuzuku, Hiroto Imachi, Takuya Akiba

Accelerated Methods with Compressed Communications for Distributed Optimization Problems under Data Similarity

In recent years, as data and problem sizes have increased, distributed learning has become an essential tool for training high-performance models. However, the communication bottleneck, especially for high-dimensional data, is a challenge.…

Optimization and Control · Mathematics 2025-04-28 Dmitry Bylinkin, Aleksandr Beznosikov

On the convergence rate of distributed gradient methods for finite-sum optimization under communication delays

Motivated by applications in machine learning and statistics, we study distributed optimization problems over a network of processors, where the goal is to optimize a global objective composed of a sum of local functions. In these problems,…

Optimization and Control · Mathematics 2019-05-14 Thinh T. Doan, Carolyn L. Beck, R. Srikant

Communication-Efficient Distributed SGD with Compressed Sensing

We consider large scale distributed optimization over a set of edge devices connected to a central server, where the limited communication bandwidth between the server and edge devices imposes a significant bottleneck for the optimization…

Optimization and Control · Mathematics 2021-12-28 Yujie Tang, Vikram Ramanathan, Junshan Zhang, Na Li

Compressed Gradient Tracking Algorithms for Distributed Nonconvex Optimization

In this paper, we study the distributed nonconvex optimization problem, which aims to minimize the average value of the local nonconvex cost functions using local information exchange. To reduce the communication overhead, we introduce…

Optimization and Control · Mathematics 2025-02-12 Lei Xu, Xinlei Yi, Guanghui Wen, Yang Shi, Karl H. Johansson, Tao Yang

CD-SGD: Distributed Stochastic Gradient Descent with Compression and Delay Compensation

Communication overhead is the key challenge for distributed training. Gradient compression is a widely used approach to reduce communication traffic. When combining with parallel communication mechanism method like pipeline, gradient…

Machine Learning · Computer Science 2021-09-08 Enda Yu, Dezun Dong, Yemao Xu, Shuo Ouyang, Xiangke Liao

Communication Compression for Distributed Nonconvex Optimization

This paper considers distributed nonconvex optimization with the cost functions being distributed over agents. Noting that information compression is a key tool to reduce the heavy communication load for distributed algorithms as agents…

Optimization and Control · Mathematics 2022-10-10 Xinlei Yi, Shengjun Zhang, Tao Yang, Tianyou Chai, Karl H. Johansson

Distributed Stochastic Approximation for Solving Network Optimization Problems Under Random Quantization

We study distributed optimization problems over a network when the communication between the nodes is constrained, and so information that is exchanged between the nodes must be quantized. This imperfect communication poses a fundamental…

Optimization and Control · Mathematics 2018-10-30 Thinh T. Doan, Siva Theja Maguluri, Justin Romberg

A Distributed Optimization Algorithm over Time-Varying Graphs with Efficient Gradient Evaluations

We propose an algorithm for distributed optimization over time-varying communication networks. Our algorithm uses an optimized ratio between the number of rounds of communication and gradient evaluations to achieve fast convergence. The…

Optimization and Control · Mathematics 2020-01-08 Bryan Van Scoy, Laurent Lessard

Temporal Predictive Coding for Gradient Compression in Distributed Learning

This paper proposes a prediction-based gradient compression method for distributed learning with event-triggered communication. Our goal is to reduce the amount of information transmitted from the distributed agents to the parameter server…

Information Theory · Computer Science 2024-10-04 Adrian Edin, Zheng Chen, Michel Kieffer, Mikael Johansson

Robust and Efficient Distributed Compression for Cloud Radio Access Networks

This work studies distributed compression for the uplink of a cloud radio access network where multiple multi-antenna base stations (BSs) are connected to a central unit, also referred to as cloud decoder, via capacity-constrained backhaul…

Information Theory · Computer Science 2012-06-19 Seok-Hwan Park, Osvaldo Simeone, Onur Sahin, Shlomo Shamai

Optimal Gradient Compression for Distributed and Federated Learning

Communicating information, like gradient vectors, between computing nodes in distributed and federated learning is typically an unavoidable burden, resulting in scalability issues. Indeed, communication might be slow and costly. Recent…

Machine Learning · Computer Science 2020-10-08 Alyazeed Albasyoni, Mher Safaryan, Laurent Condat, Peter Richtárik

Wyner-Ziv Estimators for Distributed Mean Estimation with Side Information and Optimization

Communication efficient distributed mean estimation is an important primitive that arises in many distributed learning and optimization scenarios such as federated learning. Without any probabilistic assumptions on the underlying data, we…

Information Theory · Computer Science 2022-11-15 Prathamesh Mayekar, Shubham Jha, Ananda Theertha Suresh, Himanshu Tyagi

Decentralized Composite Optimization with Compression

Decentralized optimization and communication compression have exhibited their great potential in accelerating distributed machine learning by mitigating the communication bottleneck in practice. While existing decentralized algorithms with…

Machine Learning · Computer Science 2021-08-13 Yao Li, Xiaorui Liu, Jiliang Tang, Ming Yan, Kun Yuan

Improved Quantization Strategies for Managing Heavy-tailed Gradients in Distributed Learning

Gradient compression has surfaced as a key technique to address the challenge of communication efficiency in distributed learning. In distributed deep learning, however, it is observed that gradient distributions are heavy-tailed, with…

Machine Learning · Computer Science 2024-02-07 Guangfeng Yan, Tan Li, Yuanzhang Xiao, Hanxu Hou, Linqi Song