Related papers: Accelerated Natural Gradient Method for Parametric…

AsymptoticNG: A regularized natural gradient optimization algorithm with look-ahead strategy

Optimizers that further adjust the scale of the gradient, such as Adam and Natural Gradient (NG), are widely studied and used by the community, yet they are often found to generalize poorly compared with Stochastic Gradient Descent…

Machine Learning · Computer Science 2021-01-19 Zedong Tang, Fenlong Jiang, Junke Song, Maoguo Gong, Hao Li, Fan Yu, Zidong Wang, Min Wang

Natural gradient descent with momentum

We consider the problem of approximating a function by an element of a nonlinear manifold which admits a differentiable parametrization, typical examples being neural networks with differentiable activation functions or tensor networks.…

Machine Learning · Computer Science 2026-04-20 Anthony Nouy, Agustín Somacal

Generalization to the Natural Gradient Descent

The optimization problem, which aims to find the global minimum of a given cost function, is one of the central problems in science and engineering. Various numerical methods have been proposed to solve this problem, among which the…

Optimization and Control · Mathematics 2022-10-07 Shaojun Dong, Fengyu Le, Meng Zhang, Si-Jing Tao, Chao Wang, Yong-Jian Han, Guo-Ping Guo

Efficient Natural Gradient Descent Methods for Large-Scale PDE-Based Optimization Problems

We propose efficient numerical schemes for implementing the natural gradient descent (NGD) for a broad range of metric spaces with applications to PDE-based optimization problems. Our technique represents the natural gradient direction as a…

Optimization and Control · Mathematics 2023-01-12 Levon Nurbekyan, Wanzhou Lei, Yunan Yang

Accelerating Rescaled Gradient Descent: Fast Optimization of Smooth Functions

We present a family of algorithms, called descent algorithms, for optimizing convex and non-convex functions. We also introduce a new first-order algorithm, called rescaled gradient descent (RGD), and show that RGD achieves a faster…

Optimization and Control · Mathematics 2020-01-07 Ashia Wilson, Lester Mackey, Andre Wibisono

On the Curved Geometry of Accelerated Optimization

In this work we propose a differential geometric motivation for Nesterov's accelerated gradient method (AGM) for strongly-convex problems. By considering the optimization procedure as occurring on a Riemannian manifold with a natural…

Machine Learning · Computer Science 2019-11-21 Aaron Defazio

Gradient Regularized Natural Gradients

Gradient regularization (GR) has been shown to improve the generalizability of trained models. While Natural Gradient Descent has been shown to accelerate optimization in the initial phase of training, little attention has been paid to how…

Machine Learning · Computer Science 2026-03-27 Satya Prakash Dash, Hossein Abdi, Wei Pan, Samuel Kaski, Mingfei Sun

On the Convergence of Gradient Descent in GANs: MMD GAN As a Gradient Flow

We consider the maximum mean discrepancy ($\mathrm{MMD}$) GAN problem and propose a parametric kernelized gradient flow that mimics the min-max game in gradient regularized $\mathrm{MMD}$ GAN. We show that this flow provides a descent…

Machine Learning · Computer Science 2020-11-05 Youssef Mroueh, Truyen Nguyen

Towards Riemannian Accelerated Gradient Methods

We propose a Riemannian version of Nesterov's Accelerated Gradient algorithm (RAGD), and show that for geodesically smooth and strongly convex problems, within a neighborhood of the minimizer whose radius depends on the condition number as…

Optimization and Control · Mathematics 2018-06-08 Hongyi Zhang, Suvrit Sra

Accelerated Extra-Gradient Descent: A Novel Accelerated First-Order Method

We provide a novel accelerated first-order method that achieves the asymptotically optimal convergence rate for smooth functions in the first-order oracle model. To this day, Nesterov's Accelerated Gradient Descent (AGD) and variations…

Optimization and Control · Mathematics 2018-02-13 Jelena Diakonikolas, Lorenzo Orecchia

Adaptive Gradient Regularization: A Faster and Generalizable Optimization Technique for Deep Neural Networks

Stochastic optimization plays a crucial role in the advancement of deep learning technologies. Over the decades, significant effort has been dedicated to improving the training efficiency and robustness of deep neural networks, via various…

Machine Learning · Computer Science 2024-08-21 Huixiu Jiang, Ling Yang, Yu Bao, Rutong Si, Sikun Yang

ONG: Orthogonal Natural Gradient Descent

Orthogonal Gradient Descent (OGD) has emerged as a powerful method for continual learning. However, its Euclidean projections do not leverage the underlying information-geometric structure of the problem, which can lead to suboptimal…

Machine Learning · Computer Science 2025-12-09 Yajat Yadav, Patrick Mendoza, Jathin Korrapati

Accelerating Natural Gradient Descent for PINNs with Randomized Numerical Linear Algebra

Natural Gradient Descent (NGD) has emerged as a promising optimization algorithm for training neural network-based solvers for partial differential equations (PDEs), such as Physics-Informed Neural Networks (PINNs). However, its practical…

Numerical Analysis · Mathematics 2026-05-28 Ivan Bioli, Carlo Marcati, Giancarlo Sangalli

Accelerated Stochastic Gradient Descent for Minimizing Finite Sums

We propose an optimization method for minimizing the finite sums of smooth convex functions. Our method incorporates an accelerated gradient descent (AGD) and a stochastic variance reduction gradient (SVRG) in a mini-batch setting. Unlike…

Machine Learning · Statistics 2015-06-11 Atsushi Nitanda

Solving Convex Smooth Function Constrained Optimization Is Almost As Easy As Unconstrained Optimization

While Nesterov's Accelerated Gradient Descent (AGD) efficiently solves constrained problems when the constraint set $X \subseteq \mathbb{R}^n$ is simple and easy to project onto, it remains an open question whether function-constrained…

Optimization and Control · Mathematics 2025-12-02 Zhe Zhang, Guanghui Lan

Natural Hypergradient Descent: Algorithm Design, Convergence Analysis, and Parallel Implementation

In this work, we propose Natural Hypergradient Descent (NHGD), a new method for solving bilevel optimization problems. To address the computational bottleneck in hypergradient estimation--namely, the need to compute or approximate Hessian…

Machine Learning · Computer Science 2026-04-02 Deyi Kong, Zaiwei Chen, Shuzhong Zhang, Shancong Mou

A New Perspective of Accelerated Gradient Methods: The Controlled Invariant Manifold Approach

Gradient Descent (GD) is a ubiquitous algorithm for finding the optimal solution to an optimization problem. For reduced computational complexity, the optimal solution $x^*$ of the optimization problem must be attained in a minimum…

Optimization and Control · Mathematics 2023-06-01 Revati Gunjal, Sushama Wagh, Syed Shadab Nayyer, Alex Stankovic, Navdeep M. Singh

On the analysis of optimization with fixed-rank matrices: a quotient geometric view

We study a type of Riemannian gradient descent (RGD) algorithm, designed through Riemannian preconditioning, for optimization on $\mathcal{M}_k^{m\times n}$ -- the set of $m\times n$ real matrices with a fixed rank $k$. Our analysis is…

Optimization and Control · Mathematics 2024-08-15 Shuyu Dong, Bin Gao, Wen Huang, Kyle A. Gallivan

A Note on the Gradient-Evaluation Sequence in Accelerated Gradient Methods

Nesterov's accelerated gradient descent method (AGD) is a seminal deterministic first-order method known to achieve the optimal order of iteration complexity for solving convex smooth optimization problems. Two distinct sequences of…

Optimization and Control · Mathematics 2026-03-10 Yan Wu, Yipeng Zhang, Lu Liu, Yuyuan Ouyang

AutoGD: Automatic Learning Rate Selection for Gradient Descent

The performance of gradient-based optimization methods, such as standard gradient descent (GD), depends greatly on the choice of learning rate. However, selecting an appropriate value can require a non-trivial amount of user tuning effort…

Machine Learning · Computer Science 2025-10-14 Nikola Surjanovic, Alexandre Bouchard-Côté, Trevor Campbell