Related papers: SOFIM: Stochastic Optimization Using Regularized F…

200 papers

This paper establishes a mathematical foundation for the Adam optimizer, elucidating its connection to natural gradient descent through Riemannian and information geometry. We provide an accessible and detailed analysis of the diagonal…

Machine Learning · Computer Science 2024-09-05 Dongseong Hwang

Natural Gradient Descent, a second-degree optimization method motivated by the information geometry, makes use of the Fisher Information Matrix instead of the Hessian which is typically used. However, in many cases, the Fisher Information…

Machine Learning · Computer Science 2023-03-10 Rajesh Shrestha

Natural gradients have long been studied in deep reinforcement learning due to their fast convergence properties and covariant weight updates. However, computing natural gradients requires inversion of the Fisher Information Matrix (FIM) at…

Machine Learning · Computer Science 2026-02-12 Yingxiao Huo, Satya Prakash Dash, Radu Stoican, Samuel Kaski, Mingfei Sun

Stochastic Gradient Descent (SGD) and its momentum variants form the backbone of deep learning optimization, yet the underlying dynamics of their gradient behavior remain insufficiently understood. In this work, we reinterpret gradient…

Machine Learning · Computer Science 2026-03-09 Zhipeng Yao, Rui Yu, Guisong Chang, Ying Li, Yu Zhang, Dazhou Li

Training large-scale neural networks requires solving nonconvex optimization where the choice of optimizer fundamentally determines both convergence behavior and computational efficiency. While adaptive methods like Adam have long dominated…

Machine Learning · Computer Science 2026-01-30 Chenrui Xu, Wenjing Yan, Ying-Jun Angela Zhang

Natural gradient descent is an optimization method traditionally motivated from the perspective of information geometry, and works well for many applications as an alternative to stochastic gradient descent. In this paper we critically…

Machine Learning · Computer Science 2020-09-22 James Martens

The Quantum Fisher Information matrix (QFIM) is a central metric in promising algorithms, such as Quantum Natural Gradient Descent and Variational Quantum Imaginary Time Evolution. Computing the full QFIM for a model with $d$ parameters,…

Quantum Physics · Physics 2021-10-20 Julien Gacon, Christa Zoufal, Giuseppe Carleo, Stefan Woerner

In this work, we consider convex optimization problems with smooth objective function and nonsmooth functional constraints. We propose a new stochastic gradient algorithm, called Stochastic Halfspace Approximation Method (SHAM), to solve…

Optimization and Control · Mathematics 2024-12-04 Nitesh Kumar Singh, Ion Necoara

Our proposal is on a new stochastic optimizer for non-convex and possibly non-smooth objective functions typically defined over large dimensional design spaces. Towards this, we have tried to bridge noise-assisted global search and faster…

Machine Learning · Computer Science 2025-03-03 Uttam Suman, Mariya Mamajiwala, Mukul Saxena, Ankit Tyagi, Debasish Roy

This paper presents state estimation and stochastic optimal control gathered in one global optimization problem generating dual effect i.e. the control can improve the future estimation. As the optimal policy is impossible to compute, a…

Optimization and Control · Mathematics 2023-03-27 Emilien Flayac, Karim Dahia, Bruno Hérissé, Frédéric Jean

The Quantum Fisher Information Matrix (QFIM) plays a crucial role in quantum optimization algorithms such as Variational Quantum Imaginary Time Evolution and Quantum Natural Gradient Descent. However, computing the full QFIM incurs a…

Quantum Physics · Physics 2025-07-23 Mourad Halla

An algorithm is presented for momentum gradient descent optimization based on the first-order differential equation of the Newtonian dynamics. The fictitious mass is introduced to the dynamics of momentum for regularizing the adaptive…

Machine Learning · Computer Science 2018-05-15 Zhidong Han

In this work, we present a globalized stochastic semismooth Newton method for solving stochastic optimization problems involving smooth nonconvex and nonsmooth convex terms in the objective function. We assume that only noisy gradient and…

Optimization and Control · Mathematics 2018-03-12 Andre Milzarek, Xiantao Xiao, Shicong Cen, Zaiwen Wen, Michael Ulbrich

The Fisher information matrix (FIM) is a key quantity in statistics as it is required for example for evaluating asymptotic precisions of parameter estimates, for computing test statistics or asymptotic distributions in statistical testing,…

Methodology · Statistics 2023-02-07 Maud Delattre, Estelle Kuhn

Natural gradient descent (NGD) is a powerful optimization technique for machine learning, but the computational complexity of the inverse Fisher information matrix limits its application in training deep neural networks. To overcome this…

Machine Learning · Computer Science 2024-12-11 Weihua Liu, Said Boumaraf, Jianwu Li, Chaochao Lin, Xiabi Liu, Lijuan Niu, Naoufel Werghi

Several studies have shown the ability of natural gradient descent to minimize the objective function more efficiently than ordinary gradient descent based methods. However, the bottleneck of this approach for training deep neural networks…

Neural and Evolutionary Computing · Computer Science 2022-10-17 Abdoulaye Koroko, Ani Anciaux-Sedrakian, Ibtihel Ben Gharbia, Valérie Garès, Mounir Haddou, Quang Huy Tran

In this paper, we introduce StochGradAdam, a novel optimizer designed as an extension of the Adam algorithm, incorporating stochastic gradient sampling techniques to improve computational efficiency while maintaining robust performance.…

Machine Learning · Computer Science 2025-03-19 Juyoung Yun

The natural gradient method is widely used in statistical optimization, but its standard formulation assumes a Euclidean parameter space. This paper proposes an inversion-free stochastic natural gradient method for probability distributions…

Machine Learning · Statistics 2026-04-06 Dario Draca, Takuo Matsubara, Minh-Ngoc Tran

In deep learning, stochastic gradient descent (SGD) and its momentum-based variants are widely used for optimization. However, the internal dynamics of these methods remain underexplored. In this paper, we analyze gradient behavior through…

Machine Learning · Computer Science 2025-03-11 Zhipeng Yao, Rui Yu, Guisong Chang, Ying Li, Yu Zhang, Dazhou Li

We present practical Levenberg-Marquardt variants of Gauss-Newton and natural gradient methods for solving non-convex optimization problems that arise in training deep neural networks involving enormous numbers of variables and huge data…

Machine Learning · Computer Science 2019-06-07 Yi Ren, Donald Goldfarb