Related papers: Last-iterate convergence rates for min-max optimiz…

Linear Convergence of Primal-Dual Gradient Methods and their Performance in Distributed Optimization

In this work, we revisit a classical incremental implementation of the primal-descent dual-ascent gradient method used for the solution of equality constrained optimization problems. We provide a short proof that establishes the linear…

Optimization and Control · Mathematics 2020-01-17 Sulaiman A. Alghunaim, Ali H. Sayed

Communication-Efficient Gradient Descent-Accent Methods for Distributed Variational Inequalities: Unified Analysis and Local Updates

Distributed and federated learning algorithms and techniques are associated primarily with minimization problems. However, with the increase of minimax optimization and variational inequality problems in machine learning, the necessity of…

Optimization and Control · Mathematics 2024-06-04 Siqi Zhang, Sayantan Choudhury, Sebastian U Stich, Nicolas Loizou

A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks

We analyze speed of convergence to global optimum for gradient descent training a deep linear neural network (parameterized as $x \mapsto W_N W_{N-1} \cdots W_1 x$) by minimizing the $\ell_2$ loss over whitened data. Convergence at a linear…

Machine Learning · Computer Science 2019-10-29 Sanjeev Arora, Nadav Cohen, Noah Golowich, Wei Hu

Accelerating Asynchronous Algorithms for Convex Optimization by Momentum Compensation

Asynchronous algorithms have attracted much attention recently due to the crucial demands on solving large-scale optimization problems. However, the accelerated versions of asynchronous algorithms are rarely studied. In this paper, we…

Optimization and Control · Mathematics 2018-02-28 Cong Fang, Yameng Huang, Zhouchen Lin

Convergence of Some Convex Message Passing Algorithms to a Fixed Point

A popular approach to the MAP inference problem in graphical models is to minimize an upper bound obtained from a dual linear programming or Lagrangian relaxation by (block-)coordinate descent. This is also known as convex/convergent…

Artificial Intelligence · Computer Science 2024-06-06 Vaclav Voracek, Tomas Werner

On Convergence of Gradient Descent Ascent: A Tight Local Analysis

Gradient Descent Ascent (GDA) methods are the mainstream algorithms for minimax optimization in generative adversarial networks (GANs). Convergence properties of GDA have drawn significant interest in the recent literature. Specifically,…

Optimization and Control · Mathematics 2022-07-05 Haochuan Li, Farzan Farnia, Subhro Das, Ali Jadbabaie

Adaptive Accelerated Gradient Converging Methods under Holderian Error Bound Condition

Recent studies have shown that proximal gradient (PG) method and accelerated gradient method (APG) with restarting can enjoy a linear convergence under a weaker condition than strong convexity, namely a quadratic growth condition (QGC).…

Optimization and Control · Mathematics 2017-05-16 Mingrui Liu, Tianbao Yang

A Provably Communication-Efficient Asynchronous Distributed Inference Method for Convex and Nonconvex Problems

This paper proposes and analyzes a communication-efficient distributed optimization framework for general nonconvex nonsmooth signal processing and machine learning problems under an asynchronous protocol. At each iteration, worker machines…

Optimization and Control · Mathematics 2020-07-15 Jineng Ren, Jarvis Haupt

A Homogeneous Second-Order Descent Ascent Algorithm for Nonconvex-Strongly Concave Minimax Problems

This paper introduces a novel Homogeneous Second-order Descent Ascent (HSDA) algorithm for nonconvex-strongly concave minimax optimization problems. At each iteration, HSDA uniquely computes a search direction by solving a homogenized…

Optimization and Control · Mathematics 2026-02-17 Jia-Hao Chen, Zi Xu, Hui-Ling Zhang

Cogradient Descent for Bilinear Optimization

Conventional learning methods simplify the bilinear model by regarding two intrinsically coupled factors independently, which degrades the optimization procedure. One reason lies in the insufficient training due to the asynchronous gradient…

Computer Vision and Pattern Recognition · Computer Science 2020-06-17 Li'an Zhuo, Baochang Zhang, Linlin Yang, Hanlin Chen, Qixiang Ye, David Doermann, Guodong Guo, Rongrong Ji

Negative Stepsizes Make Gradient-Descent-Ascent Converge

Efficient computation of min-max problems is a central question in optimization, learning, games, and controls. Arguably the most natural algorithm is gradient-descent-ascent (GDA). However, since the 1970s, conventional wisdom has argued…

Optimization and Control · Mathematics 2025-05-05 Henry Shugart, Jason M. Altschuler

Gradient Methods with Memory

In this paper, we consider gradient methods for minimizing smooth convex functions, which employ the information obtained at the previous iterations in order to accelerate the convergence towards the optimal solution. This information is…

Optimization and Control · Mathematics 2021-06-02 Yurii Nesterov, Mihai I. Florea

Improved Convergence Rate of Stochastic Gradient Langevin Dynamics with Variance Reduction and its Application to Optimization

The stochastic gradient Langevin Dynamics is one of the most fundamental algorithms to solve sampling problems and non-convex optimization appearing in several machine learning applications. Especially, its variance reduced versions have…

Machine Learning · Computer Science 2022-11-22 Yuri Kinoshita, Taiji Suzuki

On the Convergence of A Class of Adam-Type Algorithms for Non-Convex Optimization

This paper studies a class of adaptive gradient based momentum algorithms that update the search directions and learning rates simultaneously using past gradients. This class, which we refer to as the "Adam-type", includes the popular…

Machine Learning · Computer Science 2019-03-12 Xiangyi Chen, Sijia Liu, Ruoyu Sun, Mingyi Hong

Data-compatibility of algorithms

The data-compatibility approach to constrained optimization, proposed here, strives to reach a point that is "close enough" to the solution set and whose target function value is "close enough" to the constrained minimum value. These notions can…

Optimization and Control · Mathematics 2020-10-26 Yair Censor, Maroun Zaknoon, Alexander J. Zaslavski

Last-Iterate Complexity of SGD for Convex and Smooth Stochastic Problems

Most results on Stochastic Gradient Descent (SGD) in the convex and smooth setting are presented under the form of bounds on the ergodic function value gap. It is an open question whether bounds can be derived directly on the last iterate…

Optimization and Control · Mathematics 2025-07-21 Guillaume Garrigos, Daniel Cortild, Lucas Ketels, Juan Peypouquet

A Hölderian backtracking method for min-max and min-min problems

We present a new algorithm to solve min-max or min-min problems out of the convex world. We use rigidity assumptions, ubiquitous in learning, making our method applicable to many optimization problems. Our approach takes advantage of hidden…

Machine Learning · Computer Science 2020-07-20 Jérôme Bolte, Lilian Glaudin, Edouard Pauwels, Mathieu Serrurier

Faster Convergence of a Randomized Coordinate Descent Method for Linearly Constrained Optimization Problems

The problem of minimizing a separable convex function under linearly coupled constraints arises from various application domains such as economic systems, distributed control, and network flow. The main challenge for solving this problem is…

Optimization and Control · Mathematics 2017-09-05 Qin Fan, Min Xu, Yiming Ying

A New Primal-Dual Algorithm for a Class of Nonlinear Compositional Convex Optimization Problems

We develop a novel primal-dual algorithm to solve a class of nonsmooth and nonlinear compositional convex minimization problems, which covers many existing and brand-new models as special cases. Our approach relies on a combination of a new…

Optimization and Control · Mathematics 2021-04-20 Yuzixuan Zhu, Deyi Liu, Quoc Tran-Dinh

A Tight Convergence Analysis for Stochastic Gradient Descent with Delayed Updates

We provide tight finite-time convergence bounds for gradient descent and stochastic gradient descent on quadratic functions, when the gradients are delayed and reflect iterates from $\tau$ rounds ago. First, we show that without stochastic…

Optimization and Control · Mathematics 2018-06-28 Yossi Arjevani, Ohad Shamir, Nathan Srebro