Related papers: Contractive error feedback for gradient compressio…

A Better Alternative to Error Feedback for Communication-Efficient Distributed Learning

Modern large-scale machine learning applications require stochastic optimization algorithms to be implemented on distributed compute systems. A key bottleneck of such systems is the communication overhead for exchanging information across…

Machine Learning · Computer Science 2021-03-16 Samuel Horváth, Peter Richtárik

Safe-EF: Error Feedback for Nonsmooth Constrained Optimization

Federated learning faces severe communication bottlenecks due to the high dimensionality of model updates. Communication compression with contractive compressors (e.g., Top-K) is often preferable in practice but can degrade performance…

Machine Learning · Computer Science 2025-06-04 Rustem Islamov, Yarden As, Ilyas Fatkhullin

Analysis of Error Feedback in Federated Non-Convex Optimization with Biased Compression

In federated learning (FL) systems, e.g., wireless networks, the communication cost between the clients and the central server can often be a bottleneck. To reduce the communication cost, the paradigm of communication compression has become…

Machine Learning · Statistics 2022-11-28 Xiaoyun Li, Ping Li

Error Feedback Fixes SignSGD and other Gradient Compression Schemes

Sign-based algorithms (e.g. signSGD) have been proposed as a biased gradient compression technique to alleviate the communication bottleneck in training large neural networks across multiple workers. We show simple convex counter-examples…

Machine Learning · Computer Science 2019-05-30 Sai Praneeth Karimireddy, Quentin Rebjock, Sebastian U. Stich, Martin Jaggi

Toward Communication Efficient Adaptive Gradient Method

In recent years, distributed optimization is proven to be an effective approach to accelerate training of large scale machine learning models such as deep neural networks. With the increasing computation power of GPUs, the bottleneck of…

Machine Learning · Computer Science 2021-09-14 Xiangyi Chen, Xiaoyun Li, Ping Li

EF21: A New, Simpler, Theoretically Better, and Practically Faster Error Feedback

Error feedback (EF), also known as error compensation, is an immensely popular convergence stabilization mechanism in the context of distributed training of supervised machine learning models enhanced by the use of contractive communication…

Machine Learning · Computer Science 2021-06-10 Peter Richtárik, Igor Sokolov, Ilyas Fatkhullin

Detached Error Feedback for Distributed SGD with Random Sparsification

The communication bottleneck has been a critical problem in large-scale distributed deep learning. In this work, we study distributed SGD with random block-wise sparsification as the gradient compressor, which is ring-allreduce compatible…

Machine Learning · Computer Science 2022-06-14 An Xu, Heng Huang

Accelerated Distributed Optimization with Compression and Error Feedback

Modern machine learning tasks often involve massive datasets and models, necessitating distributed optimization algorithms with reduced communication overhead. Communication compression, where clients transmit compressed updates to a…

Optimization and Control · Mathematics 2025-04-01 Yuan Gao, Anton Rodomanov, Jeremy Rack, Sebastian U. Stich

Communication-efficient Vertical Federated Learning via Compressed Error Feedback

Communication overhead is a known bottleneck in federated learning (FL). To address this, lossy compression is commonly used on the information communicated between the server and clients during training. In horizontal FL, where each client…

Machine Learning · Computer Science 2025-02-25 Pedro Valdeira, João Xavier, Cláudia Soares, Yuejie Chi

SA-PEF: Step-Ahead Partial Error Feedback for Efficient Federated Learning

Biased gradient compression with error feedback (EF) reduces communication in federated learning (FL), but under non-IID data, the residual error can decay slowly, causing gradient mismatch and stalled progress in the early rounds. We…

Machine Learning · Computer Science 2026-05-26 Dawit Kiros Redie, Reza Arablouei, Stefan Werner

Accelerated Sparsified SGD with Error Feedback

A stochastic gradient method for synchronous distributed optimization is studied. To reduce communication cost, we focus in particular on compressing the communicated gradients. Several works have shown that sparsified…

Optimization and Control · Mathematics 2020-06-22 Tomoya Murata, Taiji Suzuki

EF21 with Bells & Whistles: Six Algorithmic Extensions of Modern Error Feedback

First proposed by Seide (2014) as a heuristic, error feedback (EF) is a very popular mechanism for enforcing convergence of distributed gradient-based optimization methods enhanced with communication compression strategies based on the…

Machine Learning · Computer Science 2025-06-23 Ilyas Fatkhullin, Igor Sokolov, Eduard Gorbunov, Zhize Li, Peter Richtárik

ProgFed: Effective, Communication, and Computation Efficient Federated Learning by Progressive Training

Federated learning is a powerful distributed learning scheme that allows numerous edge devices to collaboratively train a model without sharing their data. However, training is resource-intensive for edge devices, and limited network…

Machine Learning · Computer Science 2024-10-25 Hui-Po Wang, Sebastian U. Stich, Yang He, Mario Fritz

EControl: Fast Distributed Optimization with Compression and Error Control

Modern distributed training relies heavily on communication compression to reduce the communication overhead. In this work, we study algorithms employing a popular class of contractive compressors in order to reduce communication overhead.…

Optimization and Control · Mathematics 2023-11-13 Yuan Gao, Rustem Islamov, Sebastian Stich

Communication Compression for Distributed Learning without Control Variates

Distributed learning algorithms, such as the ones employed in Federated Learning (FL), require communication compression to reduce the cost of client uploads. The compression methods used in practice are often biased, making error feedback…

Machine Learning · Computer Science 2025-09-12 Tomas Ortega, Chun-Yin Huang, Xiaoxiao Li, Hamid Jafarkhani

Error Feedback Reloaded: From Quadratic to Arithmetic Mean of Smoothness Constants

Error Feedback (EF) is a highly popular and immensely effective mechanism for fixing convergence issues which arise in distributed training methods (such as distributed GD or SGD) when these are enhanced with greedy communication…

Machine Learning · Computer Science 2024-02-19 Peter Richtárik, Elnur Gasanov, Konstantin Burlachenko

Biased Compression in Gradient Coding for Distributed Learning

Communication bottlenecks and the presence of stragglers pose significant challenges in distributed learning (DL). To deal with these challenges, recent advances leverage unbiased compression functions and gradient coding. However, the…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-03-18 Chengxi Li, Ming Xiao, Mikael Skoglund

Step-Ahead Error Feedback for Distributed Training with Compressed Gradient

Although distributed machine learning methods can speed up the training of large deep neural networks, the communication cost has become a non-negligible bottleneck that constrains performance. To address this challenge, the gradient…

Machine Learning · Computer Science 2022-01-25 An Xu, Zhouyuan Huo, Heng Huang

Composite Optimization with Error Feedback: the Dual Averaging Approach

Communication efficiency is a central challenge in distributed machine learning training, and message compression is a widely used solution. However, standard Error Feedback (EF) methods (Seide et al., 2014), though effective for smooth…

Optimization and Control · Mathematics 2025-10-07 Yuan Gao, Anton Rodomanov, Jeremy Rack, Sebastian Stich

Variance-based Gradient Compression for Efficient Distributed Deep Learning

Due to the substantial computational cost, training state-of-the-art deep neural networks for large-scale datasets often requires distributed training using multiple computation workers. However, by nature, workers need to frequently…

Machine Learning · Computer Science 2018-02-21 Yusuke Tsuzuku, Hiroto Imachi, Takuya Akiba