Related papers: Accelerated Natural Gradient Method for Parametric…

200 papers

Optimizers that further adjust the scale of the gradient, such as Adam and Natural Gradient (NG), are widely studied and used by the community, yet are often found to generalize poorly compared with Stochastic Gradient Descent…

Machine Learning · Computer Science 2021-01-19 Zedong Tang, Fenlong Jiang, Junke Song, Maoguo Gong, Hao Li, Fan Yu, Zidong Wang, Min Wang

We consider the problem of approximating a function by an element of a nonlinear manifold which admits a differentiable parametrization, typical examples being neural networks with differentiable activation functions or tensor networks.…

Machine Learning · Computer Science 2026-04-20 Anthony Nouy, Agustín Somacal

The optimization problem, which aims at finding the global minimum of a given cost function, is one of the central problems in science and engineering. Various numerical methods have been proposed to solve this problem, among which the…

Optimization and Control · Mathematics 2022-10-07 Shaojun Dong, Fengyu Le, Meng Zhang, Si-Jing Tao, Chao Wang, Yong-Jian Han, Guo-Ping Guo

We propose efficient numerical schemes for implementing the natural gradient descent (NGD) for a broad range of metric spaces with applications to PDE-based optimization problems. Our technique represents the natural gradient direction as a…

Optimization and Control · Mathematics 2023-01-12 Levon Nurbekyan, Wanzhou Lei, Yunan Yang

We present a family of algorithms, called descent algorithms, for optimizing convex and non-convex functions. We also introduce a new first-order algorithm, called rescaled gradient descent (RGD), and show that RGD achieves a faster…

Optimization and Control · Mathematics 2020-01-07 Ashia Wilson, Lester Mackey, Andre Wibisono

In this work we propose a differential geometric motivation for Nesterov's accelerated gradient method (AGM) for strongly-convex problems. By considering the optimization procedure as occurring on a Riemannian manifold with a natural…

Machine Learning · Computer Science 2019-11-21 Aaron Defazio

Gradient regularization (GR) has been shown to improve the generalizability of trained models. While Natural Gradient Descent has been shown to accelerate optimization in the initial phase of training, little attention has been paid to how…

Machine Learning · Computer Science 2026-03-27 Satya Prakash Dash, Hossein Abdi, Wei Pan, Samuel Kaski, Mingfei Sun

We consider the maximum mean discrepancy ($\mathrm{MMD}$) GAN problem and propose a parametric kernelized gradient flow that mimics the min-max game in gradient regularized $\mathrm{MMD}$ GAN. We show that this flow provides a descent…

Machine Learning · Computer Science 2020-11-05 Youssef Mroueh, Truyen Nguyen

We propose a Riemannian version of Nesterov's Accelerated Gradient algorithm (RAGD), and show that for geodesically smooth and strongly convex problems, within a neighborhood of the minimizer whose radius depends on the condition number as…

Optimization and Control · Mathematics 2018-06-08 Hongyi Zhang, Suvrit Sra

We provide a novel accelerated first-order method that achieves the asymptotically optimal convergence rate for smooth functions in the first-order oracle model. To this day, Nesterov's Accelerated Gradient Descent (AGD) and variations…

Optimization and Control · Mathematics 2018-02-13 Jelena Diakonikolas, Lorenzo Orecchia

Stochastic optimization plays a crucial role in the advancement of deep learning technologies. Over the decades, significant effort has been dedicated to improving the training efficiency and robustness of deep neural networks, via various…

Machine Learning · Computer Science 2024-08-21 Huixiu Jiang, Ling Yang, Yu Bao, Rutong Si, Sikun Yang

Orthogonal Gradient Descent (OGD) has emerged as a powerful method for continual learning. However, its Euclidean projections do not leverage the underlying information-geometric structure of the problem, which can lead to suboptimal…

Machine Learning · Computer Science 2025-12-09 Yajat Yadav, Patrick Mendoza, Jathin Korrapati

Natural Gradient Descent (NGD) has emerged as a promising optimization algorithm for training neural network-based solvers for partial differential equations (PDEs), such as Physics-Informed Neural Networks (PINNs). However, its practical…

Numerical Analysis · Mathematics 2026-05-28 Ivan Bioli, Carlo Marcati, Giancarlo Sangalli

We propose an optimization method for minimizing the finite sums of smooth convex functions. Our method incorporates an accelerated gradient descent (AGD) and a stochastic variance reduction gradient (SVRG) in a mini-batch setting. Unlike…

Machine Learning · Statistics 2015-06-11 Atsushi Nitanda

While Nesterov's Accelerated Gradient Descent (AGD) efficiently solves constrained problems when the constraint set $X \subseteq \mathbb{R}^n$ is simple and easy to project onto, it remains an open question whether function-constrained…

Optimization and Control · Mathematics 2025-12-02 Zhe Zhang, Guanghui Lan

In this work, we propose Natural Hypergradient Descent (NHGD), a new method for solving bilevel optimization problems. To address the computational bottleneck in hypergradient estimation--namely, the need to compute or approximate Hessian…

Machine Learning · Computer Science 2026-04-02 Deyi Kong, Zaiwei Chen, Shuzhong Zhang, Shancong Mou

Gradient Descent (GD) is a ubiquitous algorithm for finding the optimal solution to an optimization problem. For reduced computational complexity, the optimal solution $\mathrm{x^*}$ of the optimization problem must be attained in a minimum…

Optimization and Control · Mathematics 2023-06-01 Revati Gunjal, Sushama Wagh, Syed Shadab Nayyer, Alex Stankovic, Navdeep M. Singh

We study a type of Riemannian gradient descent (RGD) algorithm, designed through Riemannian preconditioning, for optimization on $\mathcal{M}_k^{m\times n}$ -- the set of $m\times n$ real matrices with a fixed rank $k$. Our analysis is…

Optimization and Control · Mathematics 2024-08-15 Shuyu Dong, Bin Gao, Wen Huang, Kyle A. Gallivan

Nesterov's accelerated gradient descent method (AGD) is a seminal deterministic first-order method known to achieve the optimal order of iteration complexity for solving convex smooth optimization problems. Two distinct sequences of…

Optimization and Control · Mathematics 2026-03-10 Yan Wu, Yipeng Zhang, Lu Liu, Yuyuan Ouyang

The performance of gradient-based optimization methods, such as standard gradient descent (GD), greatly depends on the choice of learning rate. However, it can require a non-trivial amount of user tuning effort to select an appropriate…

Machine Learning · Computer Science 2025-10-14 Nikola Surjanovic, Alexandre Bouchard-Côté, Trevor Campbell