Related papers: Nesterov-aided Stochastic Gradient Methods using L…

200 papers

This paper explores numerical methods for solving a convex differentiable semi-infinite program. We introduce a primal-dual gradient method which performs three updates iteratively: a momentum gradient ascent step to update the constraint…

Optimization and Control · Mathematics 2024-07-23 Yao Yao , Qihang Lin , Tianbao Yang

In this work, we investigate a stochastic gradient descent method for solving inverse problems that can be written as systems of linear or nonlinear ill-posed equations in Banach spaces. The method uses only a randomly selected equation at…

Numerical Analysis · Mathematics 2024-09-10 Ruixue Gu , Zhenwu Fu , Bo Han , Hongsun Fu

Convergence analysis of accelerated first-order methods for convex optimization problems is presented from the point of view of ordinary differential equation solvers. A new dynamical system, called Nesterov accelerated gradient flow, has…

Optimization and Control · Mathematics 2022-03-01 Hao Luo , Long Chen

In this paper, we study the minimax optimization problem in the smooth and strongly convex-strongly concave setting when we have access to noisy estimates of gradients. In particular, we first analyze the stochastic Gradient Descent Ascent…

Optimization and Control · Mathematics 2020-02-14 Alireza Fallah , Asuman Ozdaglar , Sarath Pattathil

We consider multi-level composite optimization problems where each mapping in the composition is the expectation over a family of random smooth mappings or the sum of some finite number of smooth mappings. We present a normalized proximal…

Optimization and Control · Mathematics 2021-05-12 Junyu Zhang , Lin Xiao

In this paper, we propose a new technique, variance controlled stochastic gradient (VCSG), to improve the performance of the stochastic variance reduced gradient (SVRG) algorithm. To avoid over-reducing the variance of gradient by…

Machine Learning · Computer Science 2021-02-22 Jia Bi , Steve R. Gunn

This paper investigates distributed zeroth-order optimization for smooth nonconvex problems, targeting the trade-off between convergence rate and sampling cost per zeroth-order gradient estimation in current algorithms that use either the…

Optimization and Control · Mathematics 2026-04-10 Huaiyi Mu , Yujie Tang , Jie Song , Zhongkui Li

In this paper, we describe a new way to get convergence rates for optimal methods in smooth (strongly) convex optimization tasks. Our approach is based on results for tasks where gradients have nonrandom small noises. Unlike previous…

Optimization and Control · Mathematics 2020-07-14 Darina Dvinskikh , Alexander Tyurin , Alexander Gasnikov , Sergey Omelchenko

Nesterov's accelerated gradient (AG) is a popular technique to optimize objective functions comprising two components: a convex loss and a penalty function. While AG methods perform well for convex penalties, such as the LASSO, convergence…

Optimization and Control · Mathematics 2024-01-04 Kai Yang , Masoud Asgharian , Sahir Bhatnagar

With the success that the field of bilevel optimization has seen in recent years, similar methodologies have started being applied to solving more difficult applications that arise in trilevel optimization. At the helm of these applications…

Optimization and Control · Mathematics 2025-05-13 Tommaso Giovannelli , Griffin Dean Kent , Luis Nunes Vicente

Bayesian Neural Networks (BNNs) provide a promising framework for modeling predictive uncertainty and enhancing out-of-distribution (OOD) robustness by estimating the posterior distribution of network parameters. Stochastic Gradient Markov…

Machine Learning · Computer Science 2025-03-04 Hyunsu Kim , Giung Nam , Chulhee Yun , Hongseok Yang , Juho Lee

Bayesian inference tasks continue to pose a computational challenge. This especially holds for spatial-temporal modeling where high-dimensional latent parameter spaces are ubiquitous. The methodology of integrated nested Laplace…

Computation · Statistics 2023-03-28 Lisa Gaedke-Merzhäuser , Elias Krainski , Radim Janalik , Håvard Rue , Olaf Schenk

Stochastic gradient descent (SGD) is a standard optimization method to minimize a training error with respect to network parameters in modern neural network learning. However, it typically suffers from proliferation of saddle points in the…

Machine Learning · Computer Science 2017-11-23 Haiping Huang , Taro Toyoizumi

We study the convergence rate of first-order methods for rectangular matrix factorization, which is a canonical nonconvex optimization problem. Specifically, given a rank-$r$ matrix $\mathbf{A}\in\mathbb{R}^{m\times n}$, we prove that…

Machine Learning · Computer Science 2024-12-03 Zhenghao Xu , Yuqing Wang , Tuo Zhao , Rachel Ward , Molei Tao

In this paper, a general stochastic optimization procedure is studied, unifying several variants of the stochastic gradient descent such as, among others, the stochastic heavy ball method, the Stochastic Nesterov Accelerated Gradient…

Optimization and Control · Mathematics 2021-07-13 A. Barakat , P. Bianchi , W. Hachem , Sh. Schechtman

Nesterov's well-known scheme for accelerating gradient descent in convex optimization problems is adapted to accelerating stationary iterative solvers for linear systems. Compared with classical Krylov subspace acceleration methods, the…

Optimization and Control · Mathematics 2021-08-10 Tao Hong , Irad Yavneh

In this paper, we present a stochastic augmented Lagrangian approach on (possibly infinite-dimensional) Riemannian manifolds to solve stochastic optimization problems with a finite number of deterministic constraints. We investigate the…

Optimization and Control · Mathematics 2025-04-01 Caroline Geiersbach , Tim Suchan , Kathrin Welker

Established methods for unsupervised representation learning such as variational autoencoders produce no or poorly calibrated uncertainty estimates, making it difficult to evaluate whether learned representations are stable and reliable. In…

Machine Learning · Computer Science 2022-08-24 Marco Miani , Frederik Warburg , Pablo Moreno-Muñoz , Nicke Skafte Detlefsen , Søren Hauberg

One of the most widely used methods for solving large-scale stochastic optimization problems is distributed asynchronous stochastic gradient descent (DASGD), a family of algorithms that result from parallelizing stochastic gradient descent…

Optimization and Control · Mathematics 2021-07-08 Zhengyuan Zhou , Panayotis Mertikopoulos , Nicholas Bambos , Peter W. Glynn , Yinyu Ye

Maximum marginal likelihood estimation (MMLE) can be formulated as the optimization of a free energy functional. From this viewpoint, the Expectation-Maximisation (EM) algorithm admits a natural interpretation as a coordinate descent method…

Machine Learning · Statistics 2026-03-10 Adam Rozzio , Rafael Athanasiades , O. Deniz Akyildiz