Related papers: Nesterov-aided Stochastic Gradient Methods using L…

200 papers

Under mild assumptions stochastic gradient methods asymptotically achieve an optimal rate of convergence if the arithmetic mean of all iterates is returned as an approximate optimal solution. However, in the absence of stochastic noise, the…

Optimization and Control · Mathematics 2022-10-06 Melinda Hagedorn, Florian Jarre

We introduce a generic scheme for accelerating first-order optimization methods in the sense of Nesterov, which builds upon a new analysis of the accelerated proximal point algorithm. Our approach consists of minimizing a convex objective…

Optimization and Control · Mathematics 2015-10-27 Hongzhou Lin, Julien Mairal, Zaid Harchaoui

We introduce a generic scheme for accelerating gradient-based optimization methods in the sense of Nesterov. The approach, called Catalyst, builds upon the inexact accelerated proximal point algorithm for minimizing a convex objective…

Machine Learning · Statistics 2018-06-20 Hongzhou Lin, Julien Mairal, Zaid Harchaoui

Block coordinate descent methods and stochastic subgradient methods have been extensively studied in optimization and machine learning. By combining randomized block sampling with stochastic subgradient methods based on dual averaging, we…

Optimization and Control · Mathematics 2015-09-16 Qi Deng, Guanghui Lan, Anand Rangarajan

Bilevel optimization problems are receiving increasing attention in machine learning as they provide a natural framework for hyperparameter optimization and meta-learning. A key step to tackle these problems is the efficient computation of…

Machine Learning · Statistics 2025-05-20 Riccardo Grazzi, Massimiliano Pontil, Saverio Salzo

While many distributed optimization algorithms have been proposed for solving smooth or convex problems over the networks, few of them can handle non-convex and non-smooth problems. Based on a proximal primal-dual approach, this paper…

Optimization and Control · Mathematics 2021-09-01 Zhiguo Wang, Jiawei Zhang, Tsung-Hui Chang, Jian Li, Zhi-Quan Luo

This work proposes and analyzes a generalized acceleration technique for decreasing the computational complexity of using stochastic collocation (SC) methods to solve partial differential equations (PDEs) with random input data. The SC…

Numerical Analysis · Mathematics 2015-05-05 Diego Galindo, Peter Jantsch, Clayton G. Webster, Guannan Zhang

We derive efficient algorithms to compute weakly Pareto optimal solutions for smooth, convex and unconstrained multiobjective optimization problems in general Hilbert spaces. To this end, we define a novel inertial gradient-like dynamical…

Optimization and Control · Mathematics 2022-07-27 Konstantin Sonntag, Sebastian Peitz

Momentum based stochastic gradient methods such as heavy ball (HB) and Nesterov's accelerated gradient descent (NAG) method are widely used in practice for training deep networks and other supervised learning models, as they often provide…

Machine Learning · Computer Science 2018-08-02 Rahul Kidambi, Praneeth Netrapalli, Prateek Jain, Sham M. Kakade

We study maximum likelihood estimation for spatial generalized linear mixed models with Gaussian process approximations using a stochastic Newton-Raphson algorithm. We consider two Gaussian Process approximations in this context: spectral…

Methodology · Statistics 2026-05-19 Samuel I. Watson, Yixin Wang, Emanuele Giorgi

Gradient-based minimax optimal algorithms have greatly promoted the development of continuous optimization and machine learning. One seminal work due to Yurii Nesterov [Nes83a] established $\tilde{\mathcal{O}}(\sqrt{L/\mu})$ gradient…

Machine Learning · Computer Science 2023-12-07 Yuanshi Liu, Hanzhen Zhao, Yang Xu, Pengyun Yue, Cong Fang

Using gradient descent (GD) with fixed or decaying step-size is a standard practice in unconstrained optimization problems. However, when the loss function is only locally convex, such a step-size schedule artificially slows GD down as it…

Machine Learning · Statistics 2023-02-03 Nhat Ho, Tongzheng Ren, Sujay Sanghavi, Purnamrita Sarkar, Rachel Ward

Stochastic optimization is a cornerstone of modern machine learning. This paper studies the generalization performance of two classical stochastic optimization algorithms: stochastic gradient descent (SGD) and Nesterov's accelerated…

Machine Learning · Computer Science 2026-03-20 Shaojie Li, Pengwei Tang, Yong Liu

We introduce a generic scheme to solve nonconvex optimization problems using gradient-based algorithms originally designed for minimizing convex functions. Even though these methods may originally require convexity to operate, the proposed…

Machine Learning · Statistics 2019-01-03 Courtney Paquette, Hongzhou Lin, Dmitriy Drusvyatskiy, Julien Mairal, Zaid Harchaoui

We study the convergence of accelerated stochastic gradient descent for strongly convex objectives under the growth condition, which states that the variance of stochastic gradient is bounded by a multiplicative part that grows with the…

Optimization and Control · Mathematics 2023-11-01 You-Lin Chen, Sen Na, Mladen Kolar

We introduce a new approach to develop stochastic optimization algorithms for a class of stochastic composite and possibly nonconvex optimization problems. The main idea is to combine two stochastic estimators to create a new hybrid one. We…

Optimization and Control · Mathematics 2020-05-05 Quoc Tran-Dinh, Nhan H. Pham, Dzung T. Phan, Lam M. Nguyen

This paper studies a class of double-loop (inner-outer) algorithms for convex composite optimization. For unconstrained problems, we develop a restarted accelerated composite gradient method that attains the optimal first-order complexity…

Optimization and Control · Mathematics 2026-02-23 Matthew X. Burns, Jiaming Liang

Background: We aimed to design a Bayesian adaption trial through extensive simulations to determine values for key design parameters, demonstrate error rates, and establish the expected sample size. The complexity of the proposed outcome…

Stochastic gradient Markov chain Monte Carlo (MCMC) algorithms have received much attention in Bayesian computing for big data problems, but they are only applicable to a small class of problems for which the parameter space has a fixed…

Computation · Statistics 2020-02-10 Qifan Song, Yan Sun, Mao Ye, Faming Liang

In the stochastic gradient descent (SGD) for sequential simulations such as the neural stochastic differential equations, the Multilevel Monte Carlo (MLMC) method is known to offer better theoretical computational complexity compared to the…

Machine Learning · Computer Science 2023-10-11 Kei Ishikawa