Related papers: Optimal Complexity in Decentralized Training

Optimal Complexity in Non-Convex Decentralized Learning over Time-Varying Networks

Decentralized optimization with time-varying networks is an emerging paradigm in machine learning. It saves remarkable communication overhead in large-scale deep training and is more robust in wireless scenarios especially when nodes are…

Machine Learning · Computer Science 2022-11-02 Xinmeng Huang, Kun Yuan

Multi-consensus Decentralized Accelerated Gradient Descent

This paper considers the decentralized convex optimization problem, which has a wide range of applications in large-scale machine learning, sensor networks, and control theory. We propose novel algorithms that achieve optimal computation…

Machine Learning · Computer Science 2023-10-11 Haishan Ye, Luo Luo, Ziang Zhou, Tong Zhang

Locally Asynchronous Stochastic Gradient Descent for Decentralised Deep Learning

Distributed training algorithms of deep neural networks show impressive convergence speedup properties on very large problems. However, they inherently suffer from communication-related slowdowns and communication topology becomes a crucial…

Machine Learning · Computer Science 2022-03-25 Tomer Avidor, Nadav Tal Israel

Achieving Linear Speedup and Near-Optimal Complexity for Decentralized Optimization over Row-stochastic Networks

A key challenge in decentralized optimization is determining the optimal convergence rate and designing algorithms to achieve it. While this problem has been extensively addressed for doubly-stochastic and column-stochastic mixing matrices,…

Optimization and Control · Mathematics 2025-06-06 Liyuan Liang, Xinyi Chen, Gan Luo, Kun Yuan

Quasi-Global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data

Decentralized training of deep learning models is a key element for enabling data privacy and on-device learning over networks. In realistic learning scenarios, the presence of heterogeneity across different clients' local datasets poses an…

Machine Learning · Computer Science 2021-06-21 Tao Lin, Sai Praneeth Karimireddy, Sebastian U. Stich, Martin Jaggi

Accelerating Optimization and Machine Learning through Decentralization

Decentralized optimization enables multiple devices to learn a global machine learning model while each individual device only has access to its local dataset. By avoiding the need for training data to leave individual users' devices, it…

Machine Learning · Computer Science 2026-04-22 Ziqin Chen, Zuang Wang, Yongqiang Wang

A Linearly Convergent Proximal Gradient Algorithm for Decentralized Optimization

Decentralized optimization is a powerful paradigm that finds applications in engineering and learning design. This work studies decentralized composite optimization problems with non-smooth regularization terms. Most existing gradient-based…

Optimization and Control · Mathematics 2019-10-29 Sulaiman A. Alghunaim, Kun Yuan, Ali H. Sayed

Accelerating Decentralized Optimization via Overlapping Local Steps

Decentralized optimization has emerged as a critical paradigm for distributed learning, enabling scalable training while preserving data privacy through peer-to-peer collaboration. However, existing methods often suffer from communication…

Machine Learning · Computer Science 2026-01-06 Yijie Zhou, Shi Pu

Enhancing Parallelism in Decentralized Stochastic Convex Optimization

Decentralized learning has emerged as a powerful approach for handling large datasets across multiple machines in a communication-efficient manner. However, such methods often face scalability limitations, as increasing the number of…

Machine Learning · Computer Science 2025-06-03 Ofri Eisen, Ron Dorfman, Kfir Y. Levy

Hop: Heterogeneity-Aware Decentralized Training

Recent work has shown that decentralized algorithms can deliver superior performance over centralized ones in the context of machine learning. The two approaches, with the main difference residing in their distinct communication patterns,…

Distributed, Parallel, and Cluster Computing · Computer Science 2019-02-08 Qinyi Luo, Jinkun Lin, Youwei Zhuo, Xuehai Qian

Decentralized Stochastic Proximal Gradient Descent with Variance Reduction over Time-varying Networks

In decentralized learning, a network of nodes cooperates to minimize an overall objective function that is usually the finite sum of their local objectives, and incorporates a non-smooth regularization term for the better generalization…

Machine Learning · Computer Science 2022-01-25 Xuanjie Li, Yuedong Xu, Jessie Hui Wang, Xin Wang, John C. S. Lui

Beyond Centralization: Provable Communication Efficient Decentralized Multi-Task Learning

Representation learning is a widely adopted framework for learning in data-scarce environments, aiming to extract common features from related tasks. While centralized approaches have been extensively studied, decentralized methods remain…

Machine Learning · Computer Science 2025-12-30 Donghwa Kang, Shana Moothedath

Efficient Decentralized Deep Learning by Dynamic Model Averaging

We propose an efficient protocol for decentralized training of deep neural networks from distributed data sources. The proposed protocol makes it possible to handle different phases of model training equally well and to quickly adapt to concept…

Machine Learning · Computer Science 2018-11-14 Michael Kamp, Linara Adilova, Joachim Sicking, Fabian Hüger, Peter Schlicht, Tim Wirtz, Stefan Wrobel

DIMAT: Decentralized Iterative Merging-And-Training for Deep Learning Models

Recent advances in decentralized deep learning algorithms have demonstrated cutting-edge performance on various tasks with large pre-trained models. However, a pivotal prerequisite for achieving this level of competitiveness is the…

Machine Learning · Computer Science 2024-04-15 Nastaran Saadati, Minh Pham, Nasla Saleem, Joshua R. Waite, Aditya Balu, Zhanhong Jiang, Chinmay Hegde, Soumik Sarkar

Communication Compression for Decentralized Training

Optimizing distributed learning systems is an art of balancing between computation and communication. There have been two lines of research that try to deal with slower networks: communication compression for low bandwidth networks,…

Machine Learning · Computer Science 2019-02-04 Hanlin Tang, Shaoduo Gan, Ce Zhang, Tong Zhang, Ji Liu

A Low Complexity Decentralized Neural Net with Centralized Equivalence using Layer-wise Learning

We design a low complexity decentralized learning algorithm to train a recently proposed large neural network in distributed processing nodes (workers). We assume the communication network between the workers is synchronized and can be…

Machine Learning · Computer Science 2020-09-30 Xinyue Liang, Alireza M. Javid, Mikael Skoglund, Saikat Chatterjee

A Unified and Refined Convergence Analysis for Non-Convex Decentralized Learning

We study the consensus decentralized optimization problem where the objective function is the average of $n$ agents' private non-convex cost functions; moreover, the agents can only communicate to their neighbors on a given network topology.…

Distributed, Parallel, and Cluster Computing · Computer Science 2022-07-20 Sulaiman A. Alghunaim, Kun Yuan

On Generalization of Decentralized Learning with Separable Data

Decentralized learning offers privacy and communication efficiency when data are naturally distributed among agents communicating over an underlying graph. Motivated by overparameterized learning settings, in which models are trained to…

Machine Learning · Computer Science 2023-03-28 Hossein Taheri, Christos Thrampoulidis

Robust and Communication-Efficient Collaborative Learning

We consider a decentralized learning problem, where a set of computing nodes aim at solving a non-convex optimization problem collaboratively. It is well-known that decentralized optimization schemes face two major system bottlenecks:…

Machine Learning · Computer Science 2019-11-04 Amirhossein Reisizadeh, Hossein Taheri, Aryan Mokhtari, Hamed Hassani, Ramtin Pedarsani

From promise to practice: realizing high-performance decentralized training

Decentralized training of deep neural networks has attracted significant attention for its theoretically superior scalability over synchronous data-parallel methods like All-Reduce. However, realizing this potential in multi-node training…

Distributed, Parallel, and Cluster Computing · Computer Science 2024-10-17 Zesen Wang, Jiaojiao Zhang, Xuyang Wu, Mikael Johansson