Related papers: Robust and Communication-Efficient Collaborative L…

Accelerating Decentralized Optimization via Overlapping Local Steps

Decentralized optimization has emerged as a critical paradigm for distributed learning, enabling scalable training while preserving data privacy through peer-to-peer collaboration. However, existing methods often suffer from communication…

Machine Learning · Computer Science 2026-01-06 Yijie Zhou , Shi Pu

Communication-Efficient Distributionally Robust Decentralized Learning

Decentralized learning algorithms empower interconnected devices to share data and computational resources to collaboratively train a machine learning model without the aid of a central coordinator. In the case of heterogeneous data…

Machine Learning · Computer Science 2023-01-16 Matteo Zecchin , Marios Kountouris , David Gesbert

An Exact Quantized Decentralized Gradient Descent Algorithm

We consider the problem of decentralized consensus optimization, where the sum of $n$ smooth and strongly convex functions are minimized over $n$ distributed agents that form a connected network. In particular, we consider the case that the…

Machine Learning · Computer Science 2019-10-02 Amirhossein Reisizadeh , Aryan Mokhtari , Hamed Hassani , Ramtin Pedarsani

Error Compensated Quantized SGD and its Applications to Large-scale Distributed Optimization

Large-scale distributed optimization is of great importance in various applications. For data-parallel based distributed learning, the inter-node gradient communication often becomes the performance bottleneck. In this paper, we propose the…

Computer Vision and Pattern Recognition · Computer Science 2018-06-22 Jiaxiang Wu , Weidong Huang , Junzhou Huang , Tong Zhang

Improving the Sample and Communication Complexity for Decentralized Non-Convex Optimization: A Joint Gradient Estimation and Tracking Approach

Many modern large-scale machine learning problems benefit from decentralized and stochastic optimization. Recent works have shown that utilizing both decentralized computing and local stochastic gradient estimates can outperform…

Optimization and Control · Mathematics 2020-11-06 Haoran Sun , Songtao Lu , Mingyi Hong

Optimal Complexity in Non-Convex Decentralized Learning over Time-Varying Networks

Decentralized optimization with time-varying networks is an emerging paradigm in machine learning. It saves remarkable communication overhead in large-scale deep training and is more robust in wireless scenarios especially when nodes are…

Machine Learning · Computer Science 2022-11-02 Xinmeng Huang , Kun Yuan

Decentralized Learning with Lazy and Approximate Dual Gradients

This paper develops algorithms for decentralized machine learning over a network, where data are distributed, computation is localized, and communication is restricted between neighbors. A line of recent research in this area focuses on…

Optimization and Control · Mathematics 2020-08-06 Yanli Liu , Yuejiao Sun , Wotao Yin

CoDGraD: A Code-based Distributed Gradient Descent Scheme for Decentralized Convex Optimization

In this paper, we consider a large network containing many regions such that each region is equipped with a worker with some data processing and communication capability. For such a network, some workers may become stragglers due to the…

Systems and Control · Electrical Eng. & Systems 2022-04-14 Elie Atallah , Nazanin Rahnavard , Qiyu Sun

A Stochastic Proximal Gradient Framework for Decentralized Non-Convex Composite Optimization: Topology-Independent Sample Complexity and Communication Efficiency

Decentralized optimization is a promising parallel computation paradigm for large-scale data analytics and machine learning problems defined over a network of nodes. This paper is concerned with decentralized non-convex composite problems…

Optimization and Control · Mathematics 2021-10-05 Ran Xin , Subhro Das , Usman A. Khan , Soummya Kar

Distributed Stochastic Approximation for Solving Network Optimization Problems Under Random Quantization

We study distributed optimization problems over a network when the communication between the nodes is constrained, and so information that is exchanged between the nodes must be quantized. This imperfect communication poses a fundamental…

Optimization and Control · Mathematics 2018-10-30 Thinh T. Doan , Siva Theja Maguluri , Justin Romberg

Distributed stochastic optimization with large delays

One of the most widely used methods for solving large-scale stochastic optimization problems is distributed asynchronous stochastic gradient descent (DASGD), a family of algorithms that result from parallelizing stochastic gradient descent…

Optimization and Control · Mathematics 2021-07-08 Zhengyuan Zhou , Panayotis Mertikopoulos , Nicholas Bambos , Peter W. Glynn , Yinyu Ye

Decentralized Federated Learning: Balancing Communication and Computing Costs

Decentralized stochastic gradient descent (SGD) is a driving engine for decentralized federated learning (DFL). The performance of decentralized SGD is jointly influenced by inter-node communications and local updates. In this paper, we…

Machine Learning · Computer Science 2022-02-14 Wei Liu , Li Chen , Wenyi Zhang

Scaling up Stochastic Gradient Descent for Non-convex Optimisation

Stochastic gradient descent (SGD) is a widely adopted iterative method for optimizing differentiable objective functions. In this paper, we propose and discuss a novel approach to scale up SGD in applications involving non-convex functions…

Machine Learning · Statistics 2022-10-07 Saad Mohamad , Hamad Alamri , Abdelhamid Bouchachia

QSGD: Communication-Efficient SGD via Gradient Quantization and Encoding

Parallel implementations of stochastic gradient descent (SGD) have received significant research attention, thanks to excellent scalability properties of this algorithm, and to its efficiency in the context of training deep neural networks.…

Machine Learning · Computer Science 2017-12-07 Dan Alistarh , Demjan Grubic , Jerry Li , Ryota Tomioka , Milan Vojnovic

Accelerated Distributed Stochastic Non-Convex Optimization over Time-Varying Directed Networks

Distributed stochastic non-convex optimization problems have recently received attention due to the growing interest of signal processing, computer vision, and natural language processing communities in applications deployed over…

Systems and Control · Electrical Eng. & Systems 2024-10-14 Yiyue Chen , Abolfazl Hashemi , Haris Vikalo

A Stochastic Gradient Tracking Algorithm for Decentralized Optimization With Inexact Communication

Decentralized optimization is typically studied under the assumption of noise-free transmission. However, real-world scenarios often involve the presence of noise due to factors such as additive white Gaussian noise channels or…

Optimization and Control · Mathematics 2023-07-28 Suhail M. Shah , Raghu Bollapragada

Communication-Efficient Distributed SGD with Compressed Sensing

We consider large scale distributed optimization over a set of edge devices connected to a central server, where the limited communication bandwidth between the server and edge devices imposes a significant bottleneck for the optimization…

Optimization and Control · Mathematics 2021-12-28 Yujie Tang , Vikram Ramanathan , Junshan Zhang , Na Li

DR-DSGD: A Distributionally Robust Decentralized Learning Algorithm over Graphs

In this paper, we propose to solve a regularized distributionally robust learning problem in the decentralized setting, taking into account the data distribution shift. By adding a Kullback-Liebler regularization function to the robust…

Machine Learning · Computer Science 2022-09-13 Chaouki Ben Issaid , Anis Elgabli , Mehdi Bennis

Asynchronous Decentralized Learning over Unreliable Wireless Networks

Decentralized learning enables edge users to collaboratively train models by exchanging information via device-to-device communication, yet prior works have been limited to wireless networks with fixed topologies and reliable workers. In…

Information Theory · Computer Science 2022-02-03 Eunjeong Jeong , Matteo Zecchin , Marios Kountouris

DQ-SGD: Dynamic Quantization in SGD for Communication-Efficient Distributed Learning

Gradient quantization is an emerging technique in reducing communication costs in distributed learning. Existing gradient quantization algorithms often rely on engineering heuristics or empirical observations, lacking a systematic approach…

Machine Learning · Computer Science 2021-08-02 Guangfeng Yan , Shao-Lun Huang , Tian Lan , Linqi Song