Communication Compression for Distributed Learning without Control Variates

Tomas Ortega; Chun-Yin Huang; Xiaoxiao Li; Hamid Jafarkhani

Machine Learning · 2025-09-12 · v2 · Signal Processing · Optimization and Control

Abstract

Distributed learning algorithms, such as the ones employed in Federated Learning (FL), require communication compression to reduce the cost of client uploads. The compression methods used in practice are often biased, making error feedback necessary both to achieve convergence under aggressive compression and to provide theoretical convergence guarantees. However, error feedback requires client-specific control variates, creating two key challenges: it violates privacy-preserving principles and demands stateful clients. In this paper, we propose Compressed Aggregate Feedback (CAFe), a novel distributed learning framework that allows highly compressible client updates by exploiting past aggregated updates, and does not require control variates. We consider Distributed Gradient Descent (DGD) as a representative algorithm and analytically prove CAFe's superiority to Distributed Compressed Gradient Descent (DCGD) with biased compression in the non-convex regime with bounded gradient dissimilarity. Experimental results confirm that CAFe outperforms existing distributed learning compression schemes.
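The abstract does not spell out the update rule, so the following is only a minimal sketch of the idea of "exploiting past aggregated updates". It assumes that each client compresses the difference between its local update and the previous aggregated update (which the server has already broadcast), and that the server adds the averaged compressed differences back onto that previous aggregate. The Top-k compressor, the function names (`top_k`, `dcgd_aggregate`, `cafe_aggregate`), and the toy vectors are illustrative assumptions, not taken from the paper.

```python
def top_k(vec, k):
    """Biased Top-k compressor: keep the k largest-magnitude entries, zero the rest."""
    order = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(vec)]


def mean(vectors):
    """Coordinate-wise average of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def dcgd_aggregate(client_updates, k):
    """DCGD-style baseline: every client compresses its raw update."""
    return mean([top_k(u, k) for u in client_updates])


def cafe_aggregate(client_updates, prev_aggregate, k):
    """CAFe-style sketch (assumed form): clients compress the difference to the
    previous aggregated update; the server adds the averaged result back.
    No per-client state (control variate) is kept anywhere."""
    messages = [
        top_k([u_i - a_i for u_i, a_i in zip(u, prev_aggregate)], k)
        for u in client_updates
    ]
    correction = mean(messages)
    return [a + c for a, c in zip(prev_aggregate, correction)]


if __name__ == "__main__":
    # Toy round: both clients hold the same update, close to last round's aggregate.
    updates = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
    prev = [1.0, 2.0, 2.0]
    print("DCGD:", dcgd_aggregate(updates, k=1))
    print("CAFe sketch:", cafe_aggregate(updates, prev, k=1))
```

In this toy round, keeping a single coordinate per client is enough for the sketch to recover the true average update, because only one coordinate changed relative to the previous aggregate, whereas compressing the raw updates discards the other two coordinates. The only shared state is the previous aggregate, which every client receives anyway, so clients can remain stateless.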

Keywords

distributed training, federated learning, federated graph learning

Cite

@article{arxiv.2412.04538,
  title  = {Communication Compression for Distributed Learning without Control Variates},
  author = {Tomas Ortega and Chun-Yin Huang and Xiaoxiao Li and Hamid Jafarkhani},
  journal= {arXiv preprint arXiv:2412.04538},
  year   = {2025}
}

Comments

Revised format and minor exposition edits, results unchanged
