Related papers: SOFIM: Stochastic Optimization Using Regularized F…

FAdam: Adam is a natural gradient optimizer using diagonal empirical Fisher information

This paper establishes a mathematical foundation for the Adam optimizer, elucidating its connection to natural gradient descent through Riemannian and information geometry. We provide an accessible and detailed analysis of the diagonal…

Machine Learning · Computer Science 2024-09-05 Dongseong Hwang

Natural Gradient Methods: Perspectives, Efficient-Scalable Approximations, and Analysis

Natural Gradient Descent, a second-degree optimization method motivated by the information geometry, makes use of the Fisher Information Matrix instead of the Hessian which is typically used. However, in many cases, the Fisher Information…

Machine Learning · Computer Science 2023-03-10 Rajesh Shrestha

Rank-1 Approximation of Inverse Fisher for Natural Policy Gradients in Deep Reinforcement Learning

Natural gradients have long been studied in deep reinforcement learning due to their fast convergence properties and covariant weight updates. However, computing natural gradients requires inversion of the Fisher Information Matrix (FIM) at…

Machine Learning · Computer Science 2026-02-12 Yingxiao Huo, Satya Prakash Dash, Radu Stoican, Samuel Kaski, Mingfei Sun

Dynamic Momentum Recalibration in Online Gradient Learning

Stochastic Gradient Descent (SGD) and its momentum variants form the backbone of deep learning optimization, yet the underlying dynamics of their gradient behavior remain insufficiently understood. In this work, we reinterpret gradient…

Machine Learning · Computer Science 2026-03-09 Zhipeng Yao, Rui Yu, Guisong Chang, Ying Li, Yu Zhang, Dazhou Li

FISMO: Fisher-Structured Momentum-Orthogonalized Optimizer

Training large-scale neural networks requires solving nonconvex optimization where the choice of optimizer fundamentally determines both convergence behavior and computational efficiency. While adaptive methods like Adam have long dominated…

Machine Learning · Computer Science 2026-01-30 Chenrui Xu, Wenjing Yan, Ying-Jun Angela Zhang

New insights and perspectives on the natural gradient method

Natural gradient descent is an optimization method traditionally motivated from the perspective of information geometry, and works well for many applications as an alternative to stochastic gradient descent. In this paper we critically…

Machine Learning · Computer Science 2020-09-22 James Martens

Simultaneous Perturbation Stochastic Approximation of the Quantum Fisher Information

The Quantum Fisher Information matrix (QFIM) is a central metric in promising algorithms, such as Quantum Natural Gradient Descent and Variational Quantum Imaginary Time Evolution. Computing the full QFIM for a model with $d$ parameters,…

Quantum Physics · Physics 2021-10-20 Julien Gacon, Christa Zoufal, Giuseppe Carleo, Stefan Woerner

Stochastic halfspace approximation method for convex optimization with nonsmooth functional constraints

In this work, we consider convex optimization problems with smooth objective function and nonsmooth functional constraints. We propose a new stochastic gradient algorithm, called Stochastic Halfspace Approximation Method (SHAM), to solve…

Optimization and Control · Mathematics 2024-12-04 Nitesh Kumar Singh, Ion Necoara

FINDER: Stochastic Mirroring of Noisy Quasi-Newton Search and Deep Network Training

Our proposal is on a new stochastic optimizer for non-convex and possibly non-smooth objective functions typically defined over large dimensional design spaces. Towards this, we have tried to bridge noise-assisted global search and faster…

Machine Learning · Computer Science 2025-03-03 Uttam Suman, Mariya Mamajiwala, Mukul Saxena, Ankit Tyagi, Debasish Roy

Nonlinear Fisher Particle Output Feedback Control and its application to Terrain Aided Navigation

This paper presents state estimation and stochastic optimal control gathered in one global optimization problem generating dual effect i.e. the control can improve the future estimation. As the optimal policy is impossible to compute, a…

Optimization and Control · Mathematics 2023-03-27 Emilien Flayac, Karim Dahia, Bruno Hérissé, Frédéric Jean

Estimation of Quantum Fisher Information via Stein's Identity in Variational Quantum Algorithms

The Quantum Fisher Information Matrix (QFIM) plays a crucial role in quantum optimization algorithms such as Variational Quantum Imaginary Time Evolution and Quantum Natural Gradient Descent. However, computing the full QFIM incurs a…

Quantum Physics · Physics 2025-07-23 Mourad Halla

Dyna: A Method of Momentum for Stochastic Optimization

An algorithm is presented for momentum gradient descent optimization based on the first-order differential equation of the Newtonian dynamics. The fictitious mass is introduced to the dynamics of momentum for regularizing the adaptive…

Machine Learning · Computer Science 2018-05-15 Zhidong Han

A Stochastic Semismooth Newton Method for Nonsmooth Nonconvex Optimization

In this work, we present a globalized stochastic semismooth Newton method for solving stochastic optimization problems involving smooth nonconvex and nonsmooth convex terms in the objective function. We assume that only noisy gradient and…

Optimization and Control · Mathematics 2018-03-12 Andre Milzarek, Xiantao Xiao, Shicong Cen, Zaiwen Wen, Michael Ulbrich

Estimating Fisher Information Matrix in Latent Variable Models based on the Score Function

The Fisher information matrix (FIM) is a key quantity in statistics as it is required for example for evaluating asymptotic precisions of parameter estimates, for computing test statistics or asymptotic distributions in statistical testing,…

Methodology · Statistics 2023-02-07 Maud Delattre, Estelle Kuhn

Reconstructing Deep Neural Networks: Unleashing the Optimization Potential of Natural Gradient Descent

Natural gradient descent (NGD) is a powerful optimization technique for machine learning, but the computational complexity of the inverse Fisher information matrix limits its application in training deep neural networks. To overcome this…

Machine Learning · Computer Science 2024-12-11 Weihua Liu, Said Boumaraf, Jianwu Li, Chaochao Lin, Xiabi Liu, Lijuan Niu, Naoufel Werghi

Efficient Approximations of the Fisher Matrix in Neural Networks using Kronecker Product Singular Value Decomposition

Several studies have shown the ability of natural gradient descent to minimize the objective function more efficiently than ordinary gradient descent based methods. However, the bottleneck of this approach for training deep neural networks…

Neural and Evolutionary Computing · Computer Science 2022-10-17 Abdoulaye Koroko, Ani Anciaux-Sedrakian, Ibtihel Ben Gharbia, Valérie Garès, Mounir Haddou, Quang Huy Tran

Stochastic Gradient Sampling for Enhancing Neural Networks Training

In this paper, we introduce StochGradAdam, a novel optimizer designed as an extension of the Adam algorithm, incorporating stochastic gradient sampling techniques to improve computational efficiency while maintaining robust performance.…

Machine Learning · Computer Science 2025-03-19 Juyoung Yun

Inversion-Free Natural Gradient Descent on Riemannian Manifolds

The natural gradient method is widely used in statistical optimization, but its standard formulation assumes a Euclidean parameter space. This paper proposes an inversion-free stochastic natural gradient method for probability distributions…

Machine Learning · Statistics 2026-04-06 Dario Draca, Takuo Matsubara, Minh-Ngoc Tran

Signal Processing Meets SGD: From Momentum to Filter

In deep learning, stochastic gradient descent (SGD) and its momentum-based variants are widely used for optimization. However, the internal dynamics of these methods remain underexplored. In this paper, we analyze gradient behavior through…

Machine Learning · Computer Science 2025-03-11 Zhipeng Yao, Rui Yu, Guisong Chang, Ying Li, Yu Zhang, Dazhou Li

Efficient Subsampled Gauss-Newton and Natural Gradient Methods for Training Neural Networks

We present practical Levenberg-Marquardt variants of Gauss-Newton and natural gradient methods for solving non-convex optimization problems that arise in training deep neural networks involving enormous numbers of variables and huge data…

Machine Learning · Computer Science 2019-06-07 Yi Ren, Donald Goldfarb