Related papers: Last-iterate convergence rates for min-max optimiz…

Stochastic subGradient Methods with Linear Convergence for Polyhedral Convex Optimization

In this paper, we show that simple Stochastic subGradient Descent methods with multiple Restarting, named RSGD, can achieve a linear convergence rate for a class of non-smooth and non-strongly convex optimization problems…

Machine Learning · Computer Science 2016-04-01 Tianbao Yang, Qihang Lin

A stochastic gradient descent algorithm with random search directions

Stochastic coordinate descent algorithms are efficient methods in which each iterate is obtained by fixing most coordinates at their values from the current iteration, and approximately minimizing the objective with respect to the remaining…

Machine Learning · Statistics 2025-04-02 Eméric Gbaguidi

Two-Timescale Gradient Descent Ascent Algorithms for Nonconvex Minimax Optimization

We provide a unified analysis of two-timescale gradient descent ascent (TTGDA) for solving structured nonconvex minimax optimization problems in the form of $\min_{\mathbf{x}} \max_{\mathbf{y} \in Y} f(\mathbf{x}, \mathbf{y})$, where the…

Machine Learning · Computer Science 2025-01-28 Tianyi Lin, Chi Jin, Michael I. Jordan

On the Linear Convergence of Extra-Gradient Methods for Nonconvex-Nonconcave Minimax Problems

Recently, minimax optimization received renewed focus due to modern applications in machine learning, robust optimization, and reinforcement learning. The scale of these applications naturally leads to the use of first-order methods.…

Optimization and Control · Mathematics 2023-03-07 Saeed Hajizadeh, Haihao Lu, Benjamin Grimmer

Global Optimality in Distributed Low-rank Matrix Factorization

We study the convergence of a variant of distributed gradient descent (DGD) on a distributed low-rank matrix approximation problem wherein some optimization variables are used for consensus (as in classical DGD) and some optimization…

Optimization and Control · Mathematics 2018-12-27 Zhihui Zhu, Qiuwei Li, Xinshuo Yang, Gongguo Tang, Michael B. Wakin

Last Iterate Convergence of Incremental Methods and Applications in Continual Learning

Incremental gradient and incremental proximal methods are a fundamental class of optimization algorithms used for solving finite sum problems, broadly studied in the literature. Yet, without strong convexity, their convergence guarantees…

Optimization and Control · Mathematics 2024-07-01 Xufeng Cai, Jelena Diakonikolas

On the Convergence of Local Descent Methods in Federated Learning

In federated distributed learning, the goal is to optimize a global training objective defined over distributed devices, where the data shard at each device is sampled from a possibly different distribution (a.k.a., heterogeneous or non…

Machine Learning · Computer Science 2019-12-10 Farzin Haddadpour, Mehrdad Mahdavi

Accelerated Single-Call Methods for Constrained Min-Max Optimization

We study first-order methods for constrained min-max optimization. Existing methods either require two gradient calls or two projections in each iteration, which may be costly in some applications. In this paper, we first show that a…

Optimization and Control · Mathematics 2023-05-16 Yang Cai, Weiqiang Zheng

On Centralized and Distributed Mirror Descent: Convergence Analysis Using Quadratic Constraints

Mirror descent (MD) is a powerful first-order optimization technique that subsumes several optimization algorithms including gradient descent (GD). In this work, we develop a semi-definite programming (SDP) framework to analyze the…

Optimization and Control · Mathematics 2022-01-19 Youbang Sun, Mahyar Fazlyab, Shahin Shahrampour

On Faster Convergence of Scaled Sign Gradient Descent

Communication has been seen as a significant bottleneck in industrial applications over large-scale networks. To alleviate the communication burden, sign-based optimization algorithms have gained popularity recently in both industrial and…

Optimization and Control · Mathematics 2021-09-07 Xiuxian Li, Kuo-Yi Lin, Li Li, Yiguang Hong, Jie Chen

Projected subgradient methods for paraconvex optimization: Application to robust low-rank matrix recovery

This paper is devoted to the class of paraconvex functions and presents some of its fundamental properties, characterization, and examples that can be used for their recognition and optimization. Next, the convergence analysis of the…

Optimization and Control · Mathematics 2026-03-06 Morteza Rahimi, Susan Ghaderi, Yves Moreau, Masoud Ahookhosh

A Linearly Convergent Proximal Gradient Algorithm for Decentralized Optimization

Decentralized optimization is a powerful paradigm that finds applications in engineering and learning design. This work studies decentralized composite optimization problems with non-smooth regularization terms. Most existing gradient-based…

Optimization and Control · Mathematics 2019-10-29 Sulaiman A. Alghunaim, Kun Yuan, Ali H. Sayed

Hamiltonian Descent Methods

We propose a family of optimization methods that achieve linear convergence using first-order gradient information and constant step sizes on a class of convex functions much larger than the smooth and strongly convex ones. This larger…

Optimization and Control · Mathematics 2018-09-14 Chris J. Maddison, Daniel Paulin, Yee Whye Teh, Brendan O'Donoghue, Arnaud Doucet

Faster Margin Maximization Rates for Generic and Adversarially Robust Optimization Methods

First-order optimization methods tend to inherently favor certain solutions over others when minimizing an underdetermined training objective that has multiple global optima. This phenomenon, known as implicit bias, plays a critical role in…

Machine Learning · Computer Science 2024-04-09 Guanghui Wang, Zihao Hu, Claudio Gentile, Vidya Muthukumar, Jacob Abernethy

Accelerated Proximal Alternating Gradient-Descent-Ascent for Nonconvex Minimax Machine Learning

Alternating gradient-descent-ascent (AltGDA) is an optimization algorithm that has been widely used for model training in various machine learning applications, which aims to solve a nonconvex minimax optimization problem. However, the…

Machine Learning · Computer Science 2022-05-23 Ziyi Chen, Shaocong Ma, Yi Zhou

Stochastic Subgradient Algorithms for Strongly Convex Optimization over Distributed Networks

We study diffusion and consensus based optimization of a sum of unknown convex objective functions over distributed networks. The only access to these functions is through stochastic gradient oracles, each of which is only available at a…

Numerical Analysis · Computer Science 2015-09-01 N. Denizcan Vanli, Muhammed O. Sayin, Suleyman S. Kozat

Inexact Riemannian Gradient Descent Method for Nonconvex Optimization

Gradient descent methods are fundamental first-order optimization algorithms in both Euclidean spaces and Riemannian manifolds. However, the exact gradient is not readily available in many scenarios. This paper proposes a novel inexact…

Optimization and Control · Mathematics 2024-09-18 Juan Zhou, Kangkang Deng, Hongxia Wang, Zheng Peng

Augmented Lagrangian Method for Last-Iterate Convergence for Constrained MDPs

We study policy optimization for infinite-horizon, discounted constrained Markov decision processes (CMDPs). While existing theoretical guarantees typically hold for the mixture policy, deploying such a policy is computationally and memory…

Machine Learning · Computer Science 2026-05-13 Michael Lu, Max Qiushi Lin, Mo Chen, Sharan Vaswani

Limiting Behaviors of Nonconvex-Nonconcave Minimax Optimization via Continuous-Time Systems

Unlike nonconvex optimization, where gradient descent is guaranteed to converge to a local optimizer, algorithms for nonconvex-nonconcave minimax optimization can have topologically different solution paths: sometimes converging to a…

Optimization and Control · Mathematics 2021-03-05 Benjamin Grimmer, Haihao Lu, Pratik Worah, Vahab Mirrokni

A Dynamical View on Optimization Algorithms of Overparameterized Neural Networks

When equipped with efficient optimization algorithms, over-parameterized neural networks have demonstrated a high level of performance even though the loss function is non-convex and non-smooth. While many works have been focusing on…

Machine Learning · Computer Science 2021-03-11 Zhiqi Bu, Shiyun Xu, Kan Chen