Related papers: Last-iterate convergence rates for min-max optimiz…

On the Convergence Rate of Incremental Aggregated Gradient Algorithms

Motivated by applications to distributed optimization over networks and large-scale data processing in machine learning, we analyze the deterministic incremental aggregated gradient method for minimizing a finite sum of smooth functions…

Optimization and Control · Mathematics 2018-01-16 Mert Gurbuzbalaban, Asuman Ozdaglar, Pablo Parrilo

The Step Decay Schedule: A Near Optimal, Geometrically Decaying Learning Rate Procedure For Least Squares

Minimax optimal convergence rates for classes of stochastic convex optimization problems are well characterized, where the majority of results utilize iterate averaged stochastic gradient descent (SGD) with polynomially decaying step sizes.…

Machine Learning · Computer Science 2019-10-30 Rong Ge, Sham M. Kakade, Rahul Kidambi, Praneeth Netrapalli

A Newton-CG Algorithm with Complexity Guarantees for Smooth Unconstrained Optimization

We consider minimization of a smooth nonconvex objective function using an iterative algorithm based on Newton's method and the linear conjugate gradient algorithm, with explicit detection and use of negative curvature directions for the…

Optimization and Control · Mathematics 2018-11-14 Clément W. Royer, Michael O'Neill, Stephen J. Wright

Accelerated and nonaccelerated stochastic gradient descent with inexact model

In this paper, we propose a new way to obtain optimal convergence rates for smooth stochastic (strong) convex optimization tasks. Our approach is based on results for optimization tasks where gradients have nonrandom noise. In contrast to…

Optimization and Control · Mathematics 2020-04-16 Darina Dvinskikh, Alexander Tyurin, Alexander Gasnikov, Sergey Omelchenko

Random minibatch subgradient algorithms for convex problems with functional constraints

In this paper we consider non-smooth convex optimization problems with (possibly) infinite intersection of constraints. In contrast to the classical approach, where the constraints are usually represented as intersection of simple sets,…

Optimization and Control · Mathematics 2024-01-11 Angelia Nedich, Ion Necoara

Convergence Error Analysis of Reflected Gradient Langevin Dynamics for Globally Optimizing Non-Convex Constrained Problems

Gradient Langevin dynamics and a variety of its variants have attracted increasing attention owing to their convergence towards the global optimal solution, initially in the unconstrained convex framework while recently even in convex…

Optimization and Control · Mathematics 2024-08-15 Kanji Sato, Akiko Takeda, Reiichiro Kawai, Taiji Suzuki

Alternating proximal-gradient steps for (stochastic) nonconvex-concave minimax problems

Minimax problems of the form $\min_x \max_y \Psi(x,y)$ have attracted increased interest largely due to advances in machine learning, in particular generative adversarial networks. These are typically trained using variants of stochastic…

Optimization and Control · Mathematics 2023-04-14 Radu Ioan Boţ, Axel Böhm

Stochastic Gradient Descent with Biased but Consistent Gradient Estimators

Stochastic gradient descent (SGD), which dates back to the 1950s, is one of the most popular and effective approaches for performing stochastic optimization. Research on SGD resurged recently in machine learning for optimizing convex loss…

Machine Learning · Computer Science 2019-12-24 Jie Chen, Ronny Luss

Zeroth-Order Algorithms for Nonconvex Minimax Problems with Improved Complexities

In this paper, we study zeroth-order algorithms for minimax optimization problems that are nonconvex in one variable and strongly-concave in the other variable. Such minimax optimization problems have attracted significant attention lately…

Machine Learning · Statistics 2022-04-06 Zhongruo Wang, Krishnakumar Balasubramanian, Shiqian Ma, Meisam Razaviyayn

Revisiting Stochastic Gradient Descent for Strongly Convex Objectives: Tight Uniform-in-Time Bounds

Stochastic optimization via Stochastic Gradient Descent (SGD) is a fundamental problem in statistics and optimization. This paper revisits Stochastic Gradient Descent (SGD) for strongly convex objectives, establishing tight, uniform-in-time…

Optimization and Control · Mathematics 2026-03-19 Kang Chen, Yasong Feng, Tianyu Wang

STay-ON-the-Ridge: Guaranteed Convergence to Local Minimax Equilibrium in Nonconvex-Nonconcave Games

Min-max optimization problems involving nonconvex-nonconcave objectives have found important applications in adversarial training and other multi-agent learning settings. Yet, no known gradient descent-based method is guaranteed to converge…

Machine Learning · Computer Science 2022-10-19 Constantinos Daskalakis, Noah Golowich, Stratis Skoulakis, Manolis Zampetakis

MinMax Networks

While much progress has been achieved over the last decades in neuro-inspired machine learning, there are still fundamental theoretical problems in gradient-based learning using combinations of neurons. These problems, such as saddle points…

Machine Learning · Computer Science 2023-06-16 Winfried Lohmiller, Philipp Gassert, Jean-Jacques Slotine

Convergence and Implicit Bias of Gradient Descent on Continual Linear Classification

We study continual learning on multiple linear classification tasks by sequentially running gradient descent (GD) for a fixed budget of iterations per task. When all tasks are jointly linearly separable and are presented in a cyclic/random…

Machine Learning · Computer Science 2025-04-29 Hyunji Jung, Hanseul Cho, Chulhee Yun

On the One-sided Convergence of Adam-type Algorithms in Non-convex Non-concave Min-max Optimization

Adam-type methods, the extension of adaptive gradient methods, have shown great performance in the training of both supervised and unsupervised machine learning models. In particular, Adam-type optimizers have been widely used empirically…

Machine Learning · Computer Science 2021-09-30 Zehao Dou, Yuanzhi Li

A Variant of Gradient Descent Algorithm Based on Gradient Averaging

In this work, we study an optimizer, Grad-Avg to optimize error functions. We establish the convergence of the sequence of iterates of Grad-Avg mathematically to a minimizer (under boundedness assumption). We apply Grad-Avg along with some…

Machine Learning · Computer Science 2020-12-11 Saugata Purkayastha, Sukannya Purkayastha

Quantum Optimization via Gradient-Based Hamiltonian Descent

With rapid advancements in machine learning, first-order algorithms have emerged as the backbone of modern optimization techniques, owing to their computational efficiency and low memory requirements. Recently, the connection between…

Quantum Physics · Physics 2025-05-21 Jiaqi Leng, Bin Shi

Structured Gradient Descent for Fast Robust Low-Rank Hankel Matrix Completion

We study the robust matrix completion problem for the low-rank Hankel matrix, which detects the sparse corruptions caused by extreme outliers while we try to recover the original Hankel matrix from the partial observation. In this paper, we…

Information Theory · Computer Science 2025-04-17 HanQin Cai, Jian-Feng Cai, Juntao You

Almost Sure Convergence of Distributed Optimization with Imperfect Information Sharing

To design algorithms that reduce communication cost or meet rate constraints and are robust to communication noise, we study convex distributed optimization problems where a set of agents are interested in solving a separable optimization…

Optimization and Control · Mathematics 2023-05-02 Hadi Reisizadeh, Anand Gokhale, Behrouz Touri, Soheil Mohajer

A subgradient method with constant step-size for $\ell_1$-composite optimization

Subgradient methods are the natural extension to the non-smooth case of the classical gradient descent for regular convex optimization problems. However, in general, they are characterized by slow convergence rates, and they require…

Optimization and Control · Mathematics 2023-11-20 Alessandro Scagliotti, Piero Colli Franzone

Computationally Efficient and Statistically Optimal Robust High-Dimensional Linear Regression

High-dimensional linear regression under heavy-tailed noise or outlier corruption is challenging, both computationally and statistically. Convex approaches have been proven statistically optimal but suffer from high computational costs,…

Statistics Theory · Mathematics 2023-05-11 Yinan Shen, Jingyang Li, Jian-Feng Cai, Dong Xia