Related papers: Adaptive Conditional Gradient Descent

200 papers

We introduce a new adaptive step-size strategy for convex optimization with stochastic gradient that exploits the local geometry of the objective function only by means of a first-order stochastic oracle and without any hyper-parameter…

Machine Learning · Computer Science 2025-09-19 Jean-François Aujol, Jérémie Bigot, Camille Castera

Stochastic gradient descent is the method of choice for large scale optimization of machine learning objective functions. Yet, its performance is greatly variable and heavily depends on the choice of the stepsizes. This has motivated a…

Machine Learning · Statistics 2019-02-28 Xiaoyu Li, Francesco Orabona

This paper proposes a novel approach to adaptive step sizes in stochastic gradient descent (SGD) by utilizing quantities that we have identified as numerically traceable -- the Lipschitz constant for gradients and a concept of the local…

Optimization and Control · Mathematics 2024-09-19 Frederik Köhne, Leonie Kreis, Anton Schiela, Roland Herzog

We propose adaptive, line search-free second-order methods with optimal rate of convergence for solving convex-concave min-max problems. By means of an adaptive step size, our algorithms feature a simple update rule that requires solving…

Optimization and Control · Mathematics 2024-11-12 Ruichen Jiang, Ali Kavis, Qiujiang Jin, Sujay Sanghavi, Aryan Mokhtari

We study gradient descent (GD) with a constant stepsize for $\ell_2$-regularized logistic regression with linearly separable data. Classical theory suggests small stepsizes to ensure monotonic reduction of the optimization objective,…

Machine Learning · Statistics 2025-11-04 Jingfeng Wu, Pierre Marion, Peter Bartlett

We present an adaptive step-size method, which does not include line-search techniques, for solving a wide class of nonconvex multiobjective programming problems on an unbounded constraint set. We also prove convergence of a general…

Optimization and Control · Mathematics 2024-02-12 Nguyen Anh Minh, Le Dung Muu, Tran Ngoc Thang

We propose an adaptive step size with an energy approach for a suitable class of preconditioned gradient descent methods. We focus on settings where the preconditioning is applied to address the constraints in optimization problems, such as…

Optimization and Control · Mathematics 2024-06-17 Hailiang Liu, Levon Nurbekyan, Xuping Tian, Yunan Yang

In this paper we propose several adaptive gradient methods for stochastic optimization. Unlike AdaGrad-type methods, our algorithms are based on Armijo-type line search and they simultaneously adapt to the unknown Lipschitz constant of…

For solving a broad class of nonconvex programming problems on an unbounded constraint set, we provide a self-adaptive step-size strategy that does not include line-search techniques and establishes the convergence of a generic approach…

Optimization and Control · Mathematics 2022-12-14 Thang Tran Ngoc, Hai Trinh Ngoc

We study Stochastic Gradient Descent with AdaGrad stepsizes: a popular adaptive (self-tuning) method for first-order stochastic optimization. Despite being well studied, existing analyses of this method suffer from various shortcomings:…

Machine Learning · Computer Science 2023-06-13 Amit Attia, Tomer Koren

This paper proposes a novel proximal-gradient algorithm for a decentralized optimization problem with a composite objective containing smooth and non-smooth terms. Specifically, the smooth and nonsmooth terms are dealt with by gradient and…

Optimization and Control · Mathematics 2021-02-02 Zhi Li, Wei Shi, Ming Yan

We present adaptive gradient methods (both basic and accelerated) for solving convex composite optimization problems in which the main part is approximately smooth (a.k.a. $(\delta, L)$-smooth) and can be accessed only via a (potentially…

Optimization and Control · Mathematics 2024-06-11 Anton Rodomanov, Xiaowen Jiang, Sebastian Stich

Gradient methods are widely used in optimization problems. In practice, while the smoothness parameter can be estimated utilizing techniques such as backtracking, estimating the strong convexity parameter remains a challenge; moreover, even…

Optimization and Control · Mathematics 2026-02-17 Xiaozhe Hu, Sara Pollock, Zhongqin Xue, Yunrong Zhu

Establishing a fast rate of convergence for optimization methods is crucial to their applicability in practice. With the increasing popularity of deep learning over the past decade, stochastic gradient descent and its adaptive variants…

Optimization and Control · Mathematics 2022-01-03 Adityanarayanan Radhakrishnan, Mikhail Belkin, Caroline Uhler

The convergence of stochastic gradient descent is highly dependent on the step-size, especially on non-convex problems such as neural network training. Step decay step-size schedules (constant and then cut) are widely used in practice…

Optimization and Control · Mathematics 2021-02-19 Xiaoyu Wang, Sindri Magnússon, Mikael Johansson

Gradient descent is the primary workhorse for optimizing large-scale problems in machine learning. However, its performance is highly sensitive to the choice of the learning rate. A key limitation of gradient descent is its lack of natural…

Optimization and Control · Mathematics 2025-07-15 Oscar Smee, Fred Roosta, Stephen J. Wright

Gradient-based minimax optimal algorithms have greatly promoted the development of continuous optimization and machine learning. One seminal work due to Yurii Nesterov [Nes83a] established $\tilde{\mathcal{O}}(\sqrt{L/\mu})$ gradient…

Machine Learning · Computer Science 2023-12-07 Yuanshi Liu, Hanzhen Zhao, Yang Xu, Pengyun Yue, Cong Fang

We study a fixed step-size noisy distributed gradient descent algorithm for solving optimization problems in which the objective is a finite sum of smooth but possibly non-convex functions. Random perturbations are introduced to the…

Optimization and Control · Mathematics 2023-07-21 Lei Qin, Michael Cantoni, Ye Pu

Compressed Stochastic Gradient Descent (SGD) algorithms have been recently proposed to address the communication bottleneck in distributed and decentralized optimization problems, such as those that arise in federated machine learning.…

Machine Learning · Statistics 2022-07-21 Adarsh M. Subramaniam, Akshayaa Magesh, Venugopal V. Veeravalli

Gradient-based iterative optimization methods are the workhorse of modern machine learning. They crucially rely on careful tuning of parameters like learning rate and momentum. However, one typically sets them using heuristic approaches…

Machine Learning · Computer Science 2025-12-05 Dravyansh Sharma