Related papers: Faster Adaptive Decentralized Learning Algorithms

An Optimal Algorithm for Decentralized Finite Sum Optimization

Modern large-scale finite-sum optimization relies on two key aspects: distribution and stochastic updates. For smooth and strongly convex problems, existing decentralized algorithms are slower than modern accelerated variance-reduced…

Optimization and Control · Mathematics 2020-05-22 Hadrien Hendrikx, Francis Bach, Laurent Massoulié

An Accelerated Decentralized Stochastic Proximal Algorithm for Finite Sums

Modern large-scale finite-sum optimization relies on two key aspects: distribution and stochastic updates. For smooth and strongly convex problems, existing decentralized algorithms are slower than modern accelerated variance-reduced…

Optimization and Control · Mathematics 2019-06-13 Hadrien Hendrikx, Francis Bach, Laurent Massoulié

Adaptive Federated Minimax Optimization with Lower Complexities

Federated learning is a popular distributed and privacy-preserving learning paradigm in machine learning. Recently, some federated learning algorithms have been proposed to solve the distributed minimax problems. However, these federated…

Machine Learning · Computer Science 2024-03-01 Feihu Huang, Xinrui Wang, Junyi Li, Songcan Chen

Asynchronous Accelerated Proximal Stochastic Gradient for Strongly Convex Distributed Finite Sums

In this work, we study the problem of minimizing the sum of strongly convex functions split over a network of $n$ nodes. We propose the decentralized and asynchronous algorithm ADFS to tackle the case when local functions are themselves…

Optimization and Control · Mathematics 2019-07-18 Hadrien Hendrikx, Francis Bach, Laurent Massoulié

Improving the Sample and Communication Complexity for Decentralized Non-Convex Optimization: A Joint Gradient Estimation and Tracking Approach

Many modern large-scale machine learning problems benefit from decentralized and stochastic optimization. Recent works have shown that utilizing both decentralized computing and local stochastic gradient estimates can outperform…

Optimization and Control · Mathematics 2020-11-06 Haoran Sun, Songtao Lu, Mingyi Hong

Faster Adaptive Momentum-Based Federated Methods for Distributed Composition Optimization

Federated Learning is a popular distributed learning paradigm in machine learning. Meanwhile, composition optimization is an effective hierarchical learning model, which appears in many machine learning applications such as meta learning…

Machine Learning · Computer Science 2023-03-31 Feihu Huang

An Asynchronous Decentralised Optimisation Algorithm for Nonconvex Problems

In this paper, we consider nonconvex decentralised optimisation and learning over a network of distributed agents. We develop an ADMM algorithm based on the Randomised Block Coordinate Douglas-Rachford splitting method which enables agents…

Optimization and Control · Mathematics 2025-07-31 Behnam Mafakheri, Jonathan H. Manton, Iman Shames

An Accelerated DFO Algorithm for Finite-sum Convex Functions

Derivative-free optimization (DFO) has recently gained a lot of momentum in machine learning, spawning interest in the community to design faster methods for problems where gradients are not accessible. While some attention has been given…

Optimization and Control · Mathematics 2020-08-04 Yuwen Chen, Antonio Orvieto, Aurelien Lucchi

Decentralized Optimization with Distributed Features and Non-Smooth Objective Functions

We develop a new consensus-based distributed algorithm for solving learning problems with feature partitioning and non-smooth convex objective functions. Such learning problems are not separable, i.e., the associated objective functions…

Signal Processing · Electrical Eng. & Systems 2022-08-25 Cristiano Gratton, Naveen K. D. Venkategowda, Reza Arablouei, Stefan Werner

Adaptive Gradient Methods for Constrained Convex Optimization and Variational Inequalities

We provide new adaptive first-order methods for constrained convex optimization. Our main algorithms AdaACSA and AdaAGD+ are accelerated methods, which are universal in the sense that they achieve nearly-optimal convergence rates for both…

Machine Learning · Computer Science 2021-02-17 Alina Ene, Huy L. Nguyen, Adrian Vladu

Fast Adaptive Federated Bilevel Optimization

Bilevel optimization is a popular hierarchical model in machine learning, and has been widely applied to many machine learning tasks such as meta learning, hyperparameter learning and policy optimization. Although many bilevel optimization…

Machine Learning · Computer Science 2022-11-15 Feihu Huang

Multi-consensus Decentralized Accelerated Gradient Descent

This paper considers the decentralized convex optimization problem, which has a wide range of applications in large-scale machine learning, sensor networks, and control theory. We propose novel algorithms that achieve optimal computation…

Machine Learning · Computer Science 2023-10-11 Haishan Ye, Luo Luo, Ziang Zhou, Tong Zhang

DADAM: A Consensus-based Distributed Adaptive Gradient Method for Online Optimization

Adaptive gradient-based optimization methods such as Adagrad, RMSprop, and Adam are widely used in solving large-scale machine learning problems including deep learning. A number of schemes have been proposed in…

Machine Learning · Computer Science 2019-05-30 Parvin Nazari, Davoud Ataee Tarzanagh, George Michailidis

Communication-Efficient Stochastic Distributed Learning

We address distributed learning problems, both nonconvex and convex, over undirected networks. In particular, we design a novel algorithm based on the distributed Alternating Direction Method of Multipliers (ADMM) to address the challenges…

Machine Learning · Computer Science 2026-03-23 Xiaoxing Ren, Nicola Bastianello, Karl H. Johansson, Thomas Parisini

AdaLoss: A computationally-efficient and provably convergent adaptive gradient method

We propose a computationally-friendly adaptive learning rate schedule, "AdaLoss", which directly uses the information of the loss function to adjust the stepsize in gradient descent methods. We prove that this schedule enjoys linear…

Machine Learning · Statistics 2021-09-20 Xiaoxia Wu, Yuege Xie, Simon Du, Rachel Ward

On the Convergence of Adaptive Gradient Methods for Nonconvex Optimization

Adaptive gradient methods are workhorses in deep learning. However, the convergence guarantees of adaptive gradient methods for nonconvex optimization have not been thoroughly studied. In this paper, we provide a fine-grained convergence…

Machine Learning · Computer Science 2024-06-21 Dongruo Zhou, Jinghui Chen, Yuan Cao, Ziyan Yang, Quanquan Gu

Adaptive Federated Optimization

Federated learning is a distributed machine learning paradigm in which a large number of clients coordinate with a central server to learn a model without sharing their own training data. Standard federated optimization methods such as…

Machine Learning · Computer Science 2021-09-10 Sashank Reddi, Zachary Charles, Manzil Zaheer, Zachary Garrett, Keith Rush, Jakub Konečný, Sanjiv Kumar, H. Brendan McMahan

A Control Theoretic Framework for Adaptive Gradient Optimizers in Machine Learning

Adaptive gradient methods have become popular in optimizing deep neural networks; recent examples include AdaGrad and Adam. Although Adam usually converges faster, variations of Adam, for instance, the AdaBelief algorithm, have been…

Machine Learning · Computer Science 2024-10-29 Kushal Chakrabarti, Nikhil Chopra

Stochastic Optimization from Distributed, Streaming Data in Rate-limited Networks

Motivated by machine learning applications in networks of sensors, internet-of-things (IoT) devices, and autonomous agents, we propose techniques for distributed stochastic convex learning from high-rate data streams. The setup involves a…

Machine Learning · Statistics 2019-06-11 Matthew Nokleby, Waheed U. Bajwa

Efficient Distributed Learning over Decentralized Networks with Convoluted Support Vector Machine

This paper addresses the problem of efficiently classifying high-dimensional data over decentralized networks. Penalized support vector machines (SVMs) are widely used for high-dimensional classification tasks. However, the double…

Machine Learning · Statistics 2025-03-11 Canyi Chen, Nan Qiao, Liping Zhu