Related papers: Universal Gradient Methods for Stochastic Convex O…

Universality of AdaGrad Stepsizes for Stochastic Optimization: Inexact Oracle, Acceleration and Variance Reduction

We present adaptive gradient methods (both basic and accelerated) for solving convex composite optimization problems in which the main part is approximately smooth (a.k.a. $(\delta, L)$-smooth) and can be accessed only via a (potentially…

Optimization and Control · Mathematics 2024-06-11 Anton Rodomanov, Xiaowen Jiang, Sebastian Stich

Making Gradient Descent Optimal for Strongly Convex Stochastic Optimization

Stochastic gradient descent (SGD) is a simple and popular method to solve stochastic optimization problems which arise in machine learning. For strongly convex problems, its convergence rate was known to be $O(\log(T)/T)$, by running SGD for…

Machine Learning · Computer Science 2015-03-19 Alexander Rakhlin, Ohad Shamir, Karthik Sridharan

Online and Stochastic Universal Gradient Methods for Minimizing Regularized Hölder Continuous Finite Sums

Online and stochastic gradient methods have emerged as potent tools in large scale optimization with both smooth convex and nonsmooth convex problems from the classes $C^{1,1}(\mathbb{R}^p)$ and $C^{1,0}(\mathbb{R}^p)$ respectively. However to our…

Numerical Analysis · Mathematics 2014-10-30 Ziqiang Shi, Rujie Liu

ReSQueing Parallel and Private Stochastic Convex Optimization

We introduce a new tool for stochastic convex optimization (SCO): a Reweighted Stochastic Query (ReSQue) estimator for the gradient of a function convolved with a (Gaussian) probability density. Combining ReSQue with recent advances in ball…

Optimization and Control · Mathematics 2023-10-30 Yair Carmon, Arun Jambulapati, Yujia Jin, Yin Tat Lee, Daogao Liu, Aaron Sidford, Kevin Tian

Optimized convergence of stochastic gradient descent by weighted averaging

Under mild assumptions stochastic gradient methods asymptotically achieve an optimal rate of convergence if the arithmetic mean of all iterates is returned as an approximate optimal solution. However, in the absence of stochastic noise, the…

Optimization and Control · Mathematics 2022-10-06 Melinda Hagedorn, Florian Jarre

Stochastic Subgradient Algorithms for Strongly Convex Optimization over Distributed Networks

We study diffusion and consensus based optimization of a sum of unknown convex objective functions over distributed networks. The only access to these functions is through stochastic gradient oracles, each of which is only available at a…

Numerical Analysis · Computer Science 2015-09-01 N. Denizcan Vanli, Muhammed O. Sayin, Suleyman S. Kozat

A Retrospective Approximation Approach for Smooth Stochastic Optimization

Stochastic Gradient (SG) is the de facto iterative technique to solve stochastic optimization (SO) problems with a smooth (non-convex) objective $f$ and a stochastic first-order oracle. SG's attractiveness is due in part to its simplicity of…

Optimization and Control · Mathematics 2024-03-08 David Newton, Raghu Bollapragada, Raghu Pasupathy, Nung Kwan Yip

Convergence Analysis of Stochastic Accelerated Gradient Methods for Generalized Smooth Optimizations

We investigate the Randomized Stochastic Accelerated Gradient (RSAG) method, utilizing either constant or adaptive step sizes, for stochastic optimization problems with generalized smooth objective functions. Under relaxed affine variance…

Optimization and Control · Mathematics 2025-02-25 Chenhao Yu, Yusu Hong, Junhong Lin

Stochastic Gradient Descent in the Viewpoint of Graduated Optimization

The stochastic gradient descent (SGD) method is popular for solving non-convex optimization problems in machine learning. This work investigates SGD from the viewpoint of graduated optimization, which is a widely applied approach for non-convex…

Optimization and Control · Mathematics 2023-08-15 Da Li, Jingjing Wu, Qingrun Zhang

Optimal Rates for $O(1)$-Smooth DP-SCO with a Single Epoch and Large Batches

In this paper we revisit the DP stochastic convex optimization (SCO) problem. For convex smooth losses, it is well-known that the canonical DP-SGD (stochastic gradient descent) achieves the optimal rate of $O\left(\frac{LR}{\sqrt{n}} +…

Machine Learning · Computer Science 2024-10-04 Christopher A. Choquette-Choo, Arun Ganesh, Abhradeep Thakurta

Universal Stagewise Learning for Non-Convex Problems with Convergence on Averaged Solutions

Although the stochastic gradient descent (SGD) method and its variants (e.g., stochastic momentum methods, AdaGrad) are the algorithms of choice for solving non-convex problems (especially deep learning), there still remain big gaps between the…

Optimization and Control · Mathematics 2019-03-07 Zaiyi Chen, Zhuoning Yuan, Jinfeng Yi, Bowen Zhou, Enhong Chen, Tianbao Yang

Stochastic smoothing accelerated gradient method for general constrained nonsmooth convex composite optimization

We propose a novel stochastic smoothing accelerated gradient (SSAG) method for general constrained nonsmooth convex composite optimization, and analyze the convergence rates. The SSAG method allows various smoothing techniques, and can deal…

Optimization and Control · Mathematics 2026-02-03 Ruyu Wang, Chao Zhang

A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets

We propose a new stochastic gradient method for optimizing the sum of a finite set of smooth functions, where the sum is strongly convex. While standard stochastic gradient methods converge at sublinear rates for this problem, the proposed…

Optimization and Control · Mathematics 2013-03-12 Nicolas Le Roux, Mark Schmidt, Francis Bach

How To Make the Gradients Small Stochastically: Even Faster Convex and Nonconvex SGD

Stochastic gradient descent (SGD) gives an optimal convergence rate when minimizing convex stochastic objectives $f(x)$. However, in terms of making the gradients small, the original SGD does not give an optimal rate, even when $f(x)$ is…

Machine Learning · Computer Science 2021-07-30 Zeyuan Allen-Zhu

Hybrid Stochastic Gradient Descent Algorithms for Stochastic Nonconvex Optimization

We introduce a hybrid stochastic estimator to design stochastic gradient algorithms for solving stochastic optimization problems. Such a hybrid estimator is a convex combination of two existing biased and unbiased estimators and leads to…

Optimization and Control · Mathematics 2019-05-16 Quoc Tran-Dinh, Nhan H. Pham, Dzung T. Phan, Lam M. Nguyen

High Probability Convergence of Stochastic Gradient Methods

In this work, we describe a generic approach to show convergence with high probability for both stochastic convex and non-convex optimization with sub-Gaussian noise. In previous works for convex optimization, either the convergence is only…

Optimization and Control · Mathematics 2023-03-01 Zijian Liu, Ta Duy Nguyen, Thien Hang Nguyen, Alina Ene, Huy Lê Nguyen

When Does Stochastic Gradient Algorithm Work Well?

In this paper, we consider a general stochastic optimization problem which is often at the core of supervised learning, such as deep learning and linear classification. We consider a standard stochastic gradient descent (SGD) method with a…

Machine Learning · Statistics 2018-12-27 Lam M. Nguyen, Nam H. Nguyen, Dzung T. Phan, Jayant R. Kalagnanam, Katya Scheinberg

Stochastic Gradient Descent for Non-smooth Optimization: Convergence Results and Optimal Averaging Schemes

Stochastic Gradient Descent (SGD) is one of the simplest and most popular stochastic optimization methods. While it has already been theoretically studied for decades, the classical analysis usually required non-trivial smoothness…

Machine Learning · Computer Science 2013-01-01 Ohad Shamir, Tong Zhang

Stochastic Conditional Gradient Methods: From Convex Minimization to Submodular Maximization

This paper considers stochastic optimization problems for a large class of objective functions, including convex and continuous submodular. Stochastic proximal gradient methods have been widely used to solve such problems; however, their…

Optimization and Control · Mathematics 2018-11-13 Aryan Mokhtari, Hamed Hassani, Amin Karbasi

A universal black-box optimization method with almost dimension-free convergence rate guarantees

Universal methods for optimization are designed to achieve theoretically optimal convergence rates without any prior knowledge of the problem's regularity parameters or the accuracy of the gradient oracle employed by the optimizer. In this…

Optimization and Control · Mathematics 2022-06-22 Kimon Antonakopoulos, Dong Quan Vu, Volkan Cevher, Kfir Y. Levy, Panayotis Mertikopoulos