Related papers: Last-iterate convergence rates for min-max optimiz…
We analyze the last-iterate convergence of the Anchored Gradient Descent Ascent algorithm for smooth convex-concave min-max problems. While previous work established a last-iterate rate of $\mathcal{O}(1/t^{2-2p})$ for the squared gradient…
There are much recent interests in solving noncovnex min-max optimization problems due to its broad applications in many areas including machine learning, networked resource allocations, and distributed optimization. Perhaps, the most…
In the past several years, the last-iterate convergence of the Stochastic Gradient Descent (SGD) algorithm has triggered people's interest due to its good performance in practice but lack of theoretical understanding. For Lipschitz convex…
In a recent series of papers it has been established that variants of Gradient Descent/Ascent and Mirror Descent exhibit last iterate convergence in convex-concave zero-sum games. Specifically, \cite{DISZ17, LiangS18} show last iterate…
Delays and asynchrony are inevitable in large-scale machine-learning problems where communication plays a key role. As such, several works have extensively analyzed stochastic optimization with delayed gradients. However, as far as we are…
Quasi-convex optimization acts a pivotal part in many fields including economics and finance; the subgradient method is an effective iterative algorithm for solving large-scale quasi-convex optimization problems. In this paper, we…
This paper considers the decentralized convex optimization problem, which has a wide range of applications in large-scale machine learning, sensor networks, and control theory. We propose novel algorithms that achieve optimal computation…
Distributed stochastic gradient methods are widely used to preserve data privacy and ensure scalability in large-scale learning tasks. While existing theory on distributed momentum Stochastic Gradient Descent (mSGD) mainly focuses on…
Despite remarkable empirical success, the training dynamics of generative adversarial networks (GAN), which involves solving a minimax game using stochastic gradients, is still poorly understood. In this work, we analyze last-iterate…
We study the convergence of the shuffling gradient method, a popular algorithm employed to minimize the finite-sum function with regularization, in which functions are passed to apply (Proximal) Gradient Descent (GD) one by one whose order…
The paper considers the problem of network-based computation of global minima in smooth nonconvex optimization problems. It is known that distributed gradient-descent-type algorithms can achieve convergence to the set of global minima by…
Motivated by applications in Game Theory, Optimization, and Generative Adversarial Networks, recent work of Daskalakis et al \cite{DISZ17} and follow-up work of Liang and Stokes \cite{LiangS18} have established that a variant of the widely…
In this paper we study the smooth convex-concave saddle point problem. Specifically, we analyze the last iterate convergence properties of the Extragradient (EG) algorithm. It is well known that the ergodic (averaged) iterates of EG…
Several widely-used first-order saddle-point optimization methods yield an identical continuous-time ordinary differential equation (ODE) that is identical to that of the Gradient Descent Ascent (GDA) method when derived naively. However,…
Shuffling gradient methods are widely used in modern machine learning tasks and include three popular implementations: Random Reshuffle (RR), Shuffle Once (SO), and Incremental Gradient (IG). Compared to the empirical success, the…
We study the convergence of the last iterate in subgradient methods applied to the minimization of a nonsmooth convex function with bounded subgradients. We first introduce a proof technique that generalizes the standard analysis of…
Low-rank matrix estimation is a canonical problem that finds numerous applications in signal processing, machine learning and imaging science. A popular approach in practice is to factorize the matrix into two compact low-rank factors, and…
The stochastic momentum method is a commonly used acceleration technique for solving large-scale stochastic optimization problems in artificial neural networks. Current convergence results of stochastic momentum methods under non-convex…
The success of adversarial formulations in machine learning has brought renewed motivation for smooth games. In this work, we focus on the class of stochastic Hamiltonian methods and provide the first convergence guarantees for certain…
From optimal transport to robust dimensionality reduction, a plethora of machine learning applications can be cast into the min-max optimization problems over Riemannian manifolds. Though many min-max algorithms have been analyzed in the…