Related papers: Nesterov-aided Stochastic Gradient Methods using L…

Optimized convergence of stochastic gradient descent by weighted averaging

Under mild assumptions stochastic gradient methods asymptotically achieve an optimal rate of convergence if the arithmetic mean of all iterates is returned as an approximate optimal solution. However, in the absence of stochastic noise, the…

Optimization and Control · Mathematics 2022-10-06 Melinda Hagedorn, Florian Jarre

A Universal Catalyst for First-Order Optimization

We introduce a generic scheme for accelerating first-order optimization methods in the sense of Nesterov, which builds upon a new analysis of the accelerated proximal point algorithm. Our approach consists of minimizing a convex objective…

Optimization and Control · Mathematics 2015-10-27 Hongzhou Lin, Julien Mairal, Zaid Harchaoui

Catalyst Acceleration for First-order Convex Optimization: from Theory to Practice

We introduce a generic scheme for accelerating gradient-based optimization methods in the sense of Nesterov. The approach, called Catalyst, builds upon the inexact accelerated proximal point algorithm for minimizing a convex objective…

Machine Learning · Statistics 2018-06-20 Hongzhou Lin, Julien Mairal, Zaid Harchaoui

Randomized Block Subgradient Methods for Convex Nonsmooth and Stochastic Optimization

Block coordinate descent methods and stochastic subgradient methods have been extensively studied in optimization and machine learning. By combining randomized block sampling with stochastic subgradient methods based on dual averaging, we…

Optimization and Control · Mathematics 2015-09-16 Qi Deng, Guanghui Lan, Anand Rangarajan

Convergence Properties of Stochastic Hypergradients

Bilevel optimization problems are receiving increasing attention in machine learning as they provide a natural framework for hyperparameter optimization and meta-learning. A key step to tackle these problems is the efficient computation of…

Machine Learning · Statistics 2025-05-20 Riccardo Grazzi, Massimiliano Pontil, Saverio Salzo

Distributed Stochastic Consensus Optimization with Momentum for Nonconvex Nonsmooth Problems

While many distributed optimization algorithms have been proposed for solving smooth or convex problems over the networks, few of them can handle non-convex and non-smooth problems. Based on a proximal primal-dual approach, this paper…

Optimization and Control · Mathematics 2021-09-01 Zhiguo Wang, Jiawei Zhang, Tsung-Hui Chang, Jian Li, Zhi-Quan Luo

Accelerating stochastic collocation methods for partial differential equations with random input data

This work proposes and analyzes a generalized acceleration technique for decreasing the computational complexity of using stochastic collocation (SC) methods to solve partial differential equations (PDEs) with random input data. The SC…

Numerical Analysis · Mathematics 2015-05-05 Diego Galindo, Peter Jantsch, Clayton G. Webster, Guannan Zhang

Fast Multiobjective Gradient Methods with Nesterov Acceleration via Inertial Gradient-like Systems

We derive efficient algorithms to compute weakly Pareto optimal solutions for smooth, convex and unconstrained multiobjective optimization problems in general Hilbert spaces. To this end, we define a novel inertial gradient-like dynamical…

Optimization and Control · Mathematics 2022-07-27 Konstantin Sonntag, Sebastian Peitz

On the insufficiency of existing momentum schemes for Stochastic Optimization

Momentum based stochastic gradient methods such as heavy ball (HB) and Nesterov's accelerated gradient descent (NAG) method are widely used in practice for training deep networks and other supervised learning models, as they often provide…

Machine Learning · Computer Science 2018-08-02 Rahul Kidambi, Praneeth Netrapalli, Prateek Jain, Sham M. Kakade

Approximate Likelihood-Based Inference for Spatial Generalized Linear Mixed Models

We study maximum likelihood estimation for spatial generalized linear mixed models with Gaussian process approximations using a stochastic Newton-Raphson algorithm. We consider two Gaussian Process approximations in this context: spectral…

Methodology · Statistics 2026-05-19 Samuel I. Watson, Yixin Wang, Emanuele Giorgi

Accelerated Gradient Algorithms with Adaptive Subspace Search for Instance-Faster Optimization

Gradient-based minimax optimal algorithms have greatly promoted the development of continuous optimization and machine learning. One seminal work due to Yurii Nesterov [Nes83a] established $\tilde{\mathcal{O}}(\sqrt{L/\mu})$ gradient…

Machine Learning · Computer Science 2023-12-07 Yuanshi Liu, Hanzhen Zhao, Yang Xu, Pengyun Yue, Cong Fang

An Exponentially Increasing Step-size for Parameter Estimation in Statistical Models

Using gradient descent (GD) with fixed or decaying step-size is a standard practice in unconstrained optimization problems. However, when the loss function is only locally convex, such a step-size schedule artificially slows GD down as it…

Machine Learning · Statistics 2023-02-03 Nhat Ho, Tongzheng Ren, Sujay Sanghavi, Purnamrita Sarkar, Rachel Ward

Improved Learning Rates for Stochastic Optimization

Stochastic optimization is a cornerstone of modern machine learning. This paper studies the generalization performance of two classical stochastic optimization algorithms: stochastic gradient descent (SGD) and Nesterov's accelerated…

Machine Learning · Computer Science 2026-03-20 Shaojie Li, Pengwei Tang, Yong Liu

Catalyst Acceleration for Gradient-Based Non-Convex Optimization

We introduce a generic scheme to solve nonconvex optimization problems using gradient-based algorithms originally designed for minimizing convex functions. Even though these methods may originally require convexity to operate, the proposed…

Machine Learning · Statistics 2019-01-03 Courtney Paquette, Hongzhou Lin, Dmitriy Drusvyatskiy, Julien Mairal, Zaid Harchaoui

Convergence Analysis of Accelerated Stochastic Gradient Descent under the Growth Condition

We study the convergence of accelerated stochastic gradient descent for strongly convex objectives under the growth condition, which states that the variance of stochastic gradient is bounded by a multiplicative part that grows with the…

Optimization and Control · Mathematics 2023-11-01 You-Lin Chen, Sen Na, Mladen Kolar

A Hybrid Stochastic Optimization Framework for Stochastic Composite Nonconvex Optimization

We introduce a new approach to develop stochastic optimization algorithms for a class of stochastic composite and possibly nonconvex optimization problems. The main idea is to combine two stochastic estimators to create a new hybrid one. We…

Optimization and Control · Mathematics 2020-05-05 Quoc Tran-Dinh, Nhan H. Pham, Dzung T. Phan, Lam M. Nguyen

Improved Analysis of Restarted Accelerated Gradient and Augmented Lagrangian Methods via Inexact Proximal Point Frameworks

This paper studies a class of double-loop (inner-outer) algorithms for convex composite optimization. For unconstrained problems, we develop a restarted accelerated composite gradient method that attains the optimal first-order complexity…

Optimization and Control · Mathematics 2026-02-23 Matthew X. Burns, Jiaming Liang

Designing a Bayesian adaptive clinical trial to evaluate novel mechanical ventilation strategies in acute respiratory failure using Integrated Nested Laplace Approximations

Background: We aimed to design a Bayesian adaptive trial through extensive simulations to determine values for key design parameters, demonstrate error rates, and establish the expected sample size. The complexity of the proposed outcome…

Applications · Statistics 2023-04-03 Reyhaneh Hosseini, Ziming Chen, Ewan Goligher, Eddy Fan, Niall D. Ferguson, Michael O. Harhay, Sarina Sahetya, Martin Urner, Christopher J. Yarnell, Anna Heath

Extended Stochastic Gradient MCMC for Large-Scale Bayesian Variable Selection

Stochastic gradient Markov chain Monte Carlo (MCMC) algorithms have received much attention in Bayesian computing for big data problems, but they are only applicable to a small class of problems for which the parameter space has a fixed…

Computation · Statistics 2020-02-10 Qifan Song, Yan Sun, Mao Ye, Faming Liang

On the Parallel Complexity of Multilevel Monte Carlo in Stochastic Gradient Descent

In the stochastic gradient descent (SGD) for sequential simulations such as the neural stochastic differential equations, the Multilevel Monte Carlo (MLMC) method is known to offer better theoretical computational complexity compared to the…

Machine Learning · Computer Science 2023-10-11 Kei Ishikawa