Related papers: Last-iterate convergence rates for min-max optimiz…
The optimization problem, which aims to find the global minimum of a given cost function, is one of the central problems in science and engineering. Various numerical methods have been proposed to solve this problem, among which the…
This study investigates leveraging stochastic gradient descent (SGD) to learn operators between general Hilbert spaces. We propose weak and strong regularity conditions for the target operator to depict its intrinsic structure and…
This paper addresses the bilinearly coupled minimax optimization problem: $\min_{x \in \mathbb{R}^{d_x}}\max_{y \in \mathbb{R}^{d_y}} \ f_1(x) + f_2(x) + y^{\top} Bx - g_1(y) - g_2(y)$, where $f_1$ and $g_1$ are smooth convex functions,…
Averaging schemes have attracted extensive attention in deep learning as well as traditional machine learning. They achieve theoretically optimal convergence and also improve empirical model performance. However, there is still a lack of…
Selecting an effective step-size is a fundamental challenge in first-order optimization, especially for problems with non-Euclidean geometries. This paper presents a novel adaptive step-size strategy for optimization algorithms that rely on…
In this paper, we analyze the recently proposed stochastic primal-dual hybrid gradient (SPDHG) algorithm and provide new theoretical results. In particular, we prove almost sure convergence of the iterates to a solution with convexity and…
In this work, we analyze two of the most fundamental algorithms in geodesically convex optimization: Riemannian gradient descent and (possibly inexact) Riemannian proximal point. We quantify their rates of convergence and produce different…
Many machine learning problems, such as Generative Adversarial Networks (GANs), AUC maximization, and robust estimation, to mention but a few, can be formulated as minimax problems. A substantial number of studies are devoted to studying the…
In this paper, we study the convergence properties of a randomized block-coordinate descent algorithm for the minimization of a composite convex objective function, where the block-coordinates are updated asynchronously and randomly…
Last-iterate behaviors of learning algorithms in repeated two-player zero-sum games have been extensively studied due to their wide applications in machine learning and related tasks. Typical algorithms that exhibit the last-iterate…
Gradient-based iterative optimization methods are the workhorse of modern machine learning. They crucially rely on careful tuning of parameters like learning rate and momentum. However, one typically sets them using heuristic approaches…
Randomly initialized first-order optimization algorithms are the method of choice for solving many high-dimensional nonconvex problems in machine learning, yet general theoretical guarantees cannot rule out convergence to critical points of…
We propose a random coordinate descent algorithm for optimizing a non-convex objective function subject to one linear constraint and simple bounds on the variables. Although it is common practice to update only two random coordinates…
In this paper, we establish new convergence results for quantized distributed gradient descent and suggest a novel strategy for choosing the stepsizes to achieve high performance of the algorithm. Under the strong convexity assumption on…
Motivated by recent work on stochastic gradient descent methods, we develop two stochastic variants of greedy algorithms for possibly non-convex optimization problems with sparsity constraints. We prove linear convergence in expectation to…
This article derives lower bounds on the convergence rate of continuous-time gradient-based optimization algorithms. The algorithms are subjected to a time-normalization constraint that avoids a reparametrization of time in order to make…
Stochastic Gradient Descent (SGD) is a fundamental algorithm in machine learning, representing the optimization backbone for training several classic models, from regression to neural networks. Given the recent practical focus on…
The incremental aggregated gradient algorithm is popular in network optimization and machine learning research. However, the current convergence results require the objective function to be strongly convex. And the existing convergence…
The nonconvex-concave min-max problem arises in many machine learning applications, including minimizing a pointwise maximum of a set of nonconvex functions and robust adversarial training of neural networks. A popular approach to solve this…
Many supervised machine learning methods are naturally cast as optimization problems. For prediction models which are linear in their parameters, this often leads to convex problems for which many mathematical guarantees exist. Models which…