Related papers: Contractive error feedback for gradient compressio…

200 papers

Modern large-scale machine learning applications require stochastic optimization algorithms to be implemented on distributed compute systems. A key bottleneck of such systems is the communication overhead for exchanging information across…

Machine Learning · Computer Science 2021-03-16 Samuel Horváth, Peter Richtárik

Federated learning faces severe communication bottlenecks due to the high dimensionality of model updates. Communication compression with contractive compressors (e.g., Top-K) is often preferable in practice but can degrade performance…

Machine Learning · Computer Science 2025-06-04 Rustem Islamov, Yarden As, Ilyas Fatkhullin

In federated learning (FL) systems, e.g., wireless networks, the communication cost between the clients and the central server can often be a bottleneck. To reduce the communication cost, the paradigm of communication compression has become…

Machine Learning · Statistics 2022-11-28 Xiaoyun Li, Ping Li

Sign-based algorithms (e.g. signSGD) have been proposed as a biased gradient compression technique to alleviate the communication bottleneck in training large neural networks across multiple workers. We show simple convex counter-examples…

Machine Learning · Computer Science 2019-05-30 Sai Praneeth Karimireddy, Quentin Rebjock, Sebastian U. Stich, Martin Jaggi

In recent years, distributed optimization has proven to be an effective approach to accelerate training of large-scale machine learning models such as deep neural networks. With the increasing computation power of GPUs, the bottleneck of…

Machine Learning · Computer Science 2021-09-14 Xiangyi Chen, Xiaoyun Li, Ping Li

Error feedback (EF), also known as error compensation, is an immensely popular convergence stabilization mechanism in the context of distributed training of supervised machine learning models enhanced by the use of contractive communication…

Machine Learning · Computer Science 2021-06-10 Peter Richtárik, Igor Sokolov, Ilyas Fatkhullin

The communication bottleneck has been a critical problem in large-scale distributed deep learning. In this work, we study distributed SGD with random block-wise sparsification as the gradient compressor, which is ring-allreduce compatible…

Machine Learning · Computer Science 2022-06-14 An Xu, Heng Huang

Modern machine learning tasks often involve massive datasets and models, necessitating distributed optimization algorithms with reduced communication overhead. Communication compression, where clients transmit compressed updates to a…

Optimization and Control · Mathematics 2025-04-01 Yuan Gao, Anton Rodomanov, Jeremy Rack, Sebastian U. Stich

Communication overhead is a known bottleneck in federated learning (FL). To address this, lossy compression is commonly used on the information communicated between the server and clients during training. In horizontal FL, where each client…

Machine Learning · Computer Science 2025-02-25 Pedro Valdeira, João Xavier, Cláudia Soares, Yuejie Chi

Biased gradient compression with error feedback (EF) reduces communication in federated learning (FL), but under non-IID data, the residual error can decay slowly, causing gradient mismatch and stalled progress in the early rounds. We…

Machine Learning · Computer Science 2026-05-26 Dawit Kiros Redie, Reza Arablouei, Stefan Werner

A stochastic gradient method for synchronous distributed optimization is studied. To reduce communication cost, we focus in particular on compressing the communicated gradients. Several works have shown that sparsified…

Optimization and Control · Mathematics 2020-06-22 Tomoya Murata, Taiji Suzuki

First proposed by Seide (2014) as a heuristic, error feedback (EF) is a very popular mechanism for enforcing convergence of distributed gradient-based optimization methods enhanced with communication compression strategies based on the…

Machine Learning · Computer Science 2025-06-23 Ilyas Fatkhullin, Igor Sokolov, Eduard Gorbunov, Zhize Li, Peter Richtárik

Federated learning is a powerful distributed learning scheme that allows numerous edge devices to collaboratively train a model without sharing their data. However, training is resource-intensive for edge devices, and limited network…

Machine Learning · Computer Science 2024-10-25 Hui-Po Wang, Sebastian U. Stich, Yang He, Mario Fritz

Modern distributed training relies heavily on communication compression to reduce the communication overhead. In this work, we study algorithms employing a popular class of contractive compressors in order to reduce communication overhead.…

Optimization and Control · Mathematics 2023-11-13 Yuan Gao, Rustem Islamov, Sebastian Stich

Distributed learning algorithms, such as the ones employed in Federated Learning (FL), require communication compression to reduce the cost of client uploads. The compression methods used in practice are often biased, making error feedback…

Machine Learning · Computer Science 2025-09-12 Tomas Ortega, Chun-Yin Huang, Xiaoxiao Li, Hamid Jafarkhani

Error Feedback (EF) is a highly popular and immensely effective mechanism for fixing convergence issues which arise in distributed training methods (such as distributed GD or SGD) when these are enhanced with greedy communication…

Machine Learning · Computer Science 2024-02-19 Peter Richtárik, Elnur Gasanov, Konstantin Burlachenko

Communication bottlenecks and the presence of stragglers pose significant challenges in distributed learning (DL). To deal with these challenges, recent advances leverage unbiased compression functions and gradient coding. However, the…

Distributed, Parallel, and Cluster Computing · Computer Science 2026-03-18 Chengxi Li, Ming Xiao, Mikael Skoglund

Although distributed machine learning methods can speed up the training of large deep neural networks, communication cost has become a non-negligible bottleneck that constrains performance. To address this challenge, the gradient…

Machine Learning · Computer Science 2022-01-25 An Xu, Zhouyuan Huo, Heng Huang

Communication efficiency is a central challenge in distributed machine learning training, and message compression is a widely used solution. However, standard Error Feedback (EF) methods (Seide et al., 2014), though effective for smooth…

Optimization and Control · Mathematics 2025-10-07 Yuan Gao, Anton Rodomanov, Jeremy Rack, Sebastian Stich

Due to the substantial computational cost, training state-of-the-art deep neural networks for large-scale datasets often requires distributed training using multiple computation workers. However, by nature, workers need to frequently…

Machine Learning · Computer Science 2018-02-21 Yusuke Tsuzuku, Hiroto Imachi, Takuya Akiba