Related papers: Nesterov-aided Stochastic Gradient Methods using L…

Deterministic and Stochastic Accelerated Gradient Method for Convex Semi-Infinite Optimization

This paper explores numerical methods for solving a convex differentiable semi-infinite program. We introduce a primal-dual gradient method which performs three updates iteratively: a momentum gradient ascent step to update the constraint…

Optimization and Control · Mathematics 2024-07-23 Yao Yao, Qihang Lin, Tianbao Yang

Stochastic gradient descent method with convex penalty for ill-posed problems in Banach spaces

In this work, we investigate a stochastic gradient descent method for solving inverse problems that can be written as systems of linear or nonlinear ill-posed equations in Banach spaces. The method uses only a randomly selected equation at…

Numerical Analysis · Mathematics 2024-09-10 Ruixue Gu, Zhenwu Fu, Bo Han, Hongsun Fu

From differential equation solvers to accelerated first-order methods for convex optimization

Convergence analysis of accelerated first-order methods for convex optimization problems is presented from the point of view of ordinary differential equation solvers. A new dynamical system, called Nesterov accelerated gradient flow, has…

Optimization and Control · Mathematics 2022-03-01 Hao Luo, Long Chen

An Optimal Multistage Stochastic Gradient Method for Minimax Problems

In this paper, we study the minimax optimization problem in the smooth and strongly convex-strongly concave setting when we have access to noisy estimates of gradients. In particular, we first analyze the stochastic Gradient Descent Ascent…

Optimization and Control · Mathematics 2020-02-14 Alireza Fallah, Asuman Ozdaglar, Sarath Pattathil

Multi-Level Composite Stochastic Optimization via Nested Variance Reduction

We consider multi-level composite optimization problems where each mapping in the composition is the expectation over a family of random smooth mappings or the sum of some finite number of smooth mappings. We present a normalized proximal…

Optimization and Control · Mathematics 2021-05-12 Junyu Zhang, Lin Xiao

A Variance Controlled Stochastic Method with Biased Estimation for Faster Non-convex Optimization

In this paper, we proposed a new technique, variance controlled stochastic gradient (VCSG), to improve the performance of the stochastic variance reduced gradient (SVRG) algorithm. To avoid over-reducing the variance of gradient by…

Machine Learning · Computer Science 2021-02-22 Jia Bi, Steve R. Gunn

Variance-Reduced Gradient Estimator for Nonconvex Zeroth-Order Distributed Optimization

This paper investigates distributed zeroth-order optimization for smooth nonconvex problems, targeting the trade-off between convergence rate and sampling cost per zeroth-order gradient estimation in current algorithms that use either the…

Optimization and Control · Mathematics 2026-04-10 Huaiyi Mu, Yujie Tang, Jie Song, Zhongkui Li

Accelerated and nonaccelerated stochastic gradient descent with model conception

In this paper, we describe a new way to get convergence rates for optimal methods in smooth (strongly) convex optimization tasks. Our approach is based on results for tasks where gradients have nonrandom small noises. Unlike previous…

Optimization and Control · Mathematics 2020-07-14 Darina Dvinskikh, Alexander Tyurin, Alexander Gasnikov, Sergey Omelchenko

Accelerated Gradient Methods for Sparse Statistical Learning with Nonconvex Penalties

Nesterov's accelerated gradient (AG) is a popular technique to optimize objective functions comprising two components: a convex loss and a penalty function. While AG methods perform well for convex penalties, such as the LASSO, convergence…

Optimization and Control · Mathematics 2024-01-04 Kai Yang, Masoud Asgharian, Sahir Bhatnagar

A stochastic gradient method for trilevel optimization

With the success that the field of bilevel optimization has seen in recent years, similar methodologies have started being applied to solving more difficult applications that arise in trilevel optimization. At the helm of these applications…

Optimization and Control · Mathematics 2025-05-13 Tommaso Giovannelli, Griffin Dean Kent, Luis Nunes Vicente

Parameter Expanded Stochastic Gradient Markov Chain Monte Carlo

Bayesian Neural Networks (BNNs) provide a promising framework for modeling predictive uncertainty and enhancing out-of-distribution (OOD) robustness by estimating the posterior distribution of network parameters. Stochastic Gradient Markov…

Machine Learning · Computer Science 2025-03-04 Hyunsu Kim, Giung Nam, Chulhee Yun, Hongseok Yang, Juho Lee

Integrated Nested Laplace Approximations for Large-Scale Spatial-Temporal Bayesian Modeling

Bayesian inference tasks continue to pose a computational challenge. This especially holds for spatial-temporal modeling where high-dimensional latent parameter spaces are ubiquitous. The methodology of integrated nested Laplace…

Computation · Statistics 2023-03-28 Lisa Gaedke-Merzhäuser, Elias Krainski, Radim Janalik, Håvard Rue, Olaf Schenk

Reinforced stochastic gradient descent for deep neural network learning

Stochastic gradient descent (SGD) is a standard optimization method to minimize a training error with respect to network parameters in modern neural network learning. However, it typically suffers from proliferation of saddle points in the…

Machine Learning · Computer Science 2017-11-23 Haiping Huang, Taro Toyoizumi

Provable Acceleration of Nesterov's Accelerated Gradient for Rectangular Matrix Factorization and Linear Neural Networks

We study the convergence rate of first-order methods for rectangular matrix factorization, which is a canonical nonconvex optimization problem. Specifically, given a rank-$r$ matrix $\mathbf{A}\in\mathbb{R}^{m\times n}$, we prove that…

Machine Learning · Computer Science 2024-12-03 Zhenghao Xu, Yuqing Wang, Tuo Zhao, Rachel Ward, Molei Tao

Stochastic optimization with momentum: convergence, fluctuations, and traps avoidance

In this paper, a general stochastic optimization procedure is studied, unifying several variants of the stochastic gradient descent such as, among others, the stochastic heavy ball method, the Stochastic Nesterov Accelerated Gradient…

Optimization and Control · Mathematics 2021-07-13 A. Barakat, P. Bianchi, W. Hachem, Sh. Schechtman

On Adapting Nesterov's Scheme to Accelerate Iterative Methods for Linear Problems

Nesterov's well-known scheme for accelerating gradient descent in convex optimization problems is adapted to accelerating stationary iterative solvers for linear systems. Compared with classical Krylov subspace acceleration methods, the…

Optimization and Control · Mathematics 2021-08-10 Tao Hong, Irad Yavneh

Stochastic Augmented Lagrangian Method in Riemannian Shape Manifolds

In this paper, we present a stochastic augmented Lagrangian approach on (possibly infinite-dimensional) Riemannian manifolds to solve stochastic optimization problems with a finite number of deterministic constraints. We investigate the…

Optimization and Control · Mathematics 2025-04-01 Caroline Geiersbach, Tim Suchan, Kathrin Welker

Laplacian Autoencoders for Learning Stochastic Representations

Established methods for unsupervised representation learning, such as variational autoencoders, produce no or poorly calibrated uncertainty estimates, making it difficult to evaluate if learned representations are stable and reliable. In…

Machine Learning · Computer Science 2022-08-24 Marco Miani, Frederik Warburg, Pablo Moreno-Muñoz, Nicke Skafte Detlefsen, Søren Hauberg

Distributed stochastic optimization with large delays

One of the most widely used methods for solving large-scale stochastic optimization problems is distributed asynchronous stochastic gradient descent (DASGD), a family of algorithms that result from parallelizing stochastic gradient descent…

Optimization and Control · Mathematics 2021-07-08 Zhengyuan Zhou, Panayotis Mertikopoulos, Nicholas Bambos, Peter W. Glynn, Yinyu Ye

Momentum SVGD-EM for Accelerated Maximum Marginal Likelihood Estimation

Maximum marginal likelihood estimation (MMLE) can be formulated as the optimization of a free energy functional. From this viewpoint, the Expectation-Maximisation (EM) algorithm admits a natural interpretation as a coordinate descent method…

Machine Learning · Statistics 2026-03-10 Adam Rozzio, Rafael Athanasiades, O. Deniz Akyildiz