Related papers: Efficient Natural Gradient Descent Methods for Lar…
The optimization problem, which aims to find the global minimum of a given cost function, is one of the central problems in science and engineering. Various numerical methods have been proposed to solve this problem, among which the…
In this work, we propose Natural Hypergradient Descent (NHGD), a new method for solving bilevel optimization problems. To address the computational bottleneck in hypergradient estimation, namely the need to compute or approximate Hessian…
Natural-gradient descent (NGD) on structured parameter spaces (e.g., low-rank covariances) is computationally challenging due to difficult Fisher-matrix computations. We address this issue by using local-parameter coordinates to…
We consider the problem of approximating a function by an element of a nonlinear manifold which admits a differentiable parametrization, typical examples being neural networks with differentiable activation functions or tensor networks.…
Natural Gradient Descent (NGD) has emerged as a promising optimization algorithm for training neural network-based solvers for partial differential equations (PDEs), such as Physics-Informed Neural Networks (PINNs). However, its practical…
Variational quantum algorithms, optimized using gradient-based methods, often exhibit sub-optimal convergence performance due to their dependence on Euclidean geometry. Quantum natural gradient descent (QNGD) is a more efficient method that…
This paper introduces a projected Sobolev natural gradient descent (NGD) method for computing ground states of the Gross-Pitaevskii equation. By projecting a continuous Riemannian Sobolev gradient flow onto the normalized neural network…
Natural gradient descent (NGD) has provided deep insights and powerful tools for deep neural networks. However, the computation of the Fisher information matrix becomes increasingly difficult as the network structure grows large and complex. This…
Natural Gradient Descent, a second-order optimization method motivated by information geometry, uses the Fisher Information Matrix in place of the Hessian that is typically used. However, in many cases, the Fisher Information…
Natural gradient descent (NGD) is a powerful optimization technique for machine learning, but the computational complexity of the inverse Fisher information matrix limits its application in training deep neural networks. To overcome this…
Parametric manifold optimization problems frequently arise in various machine learning tasks, where state functions are defined on infinite-dimensional manifolds. We propose a unified accelerated natural gradient descent (ANGD) framework to…
A commonly used heuristic in non-convex optimization is Normalized Gradient Descent (NGD) - a variant of gradient descent in which only the direction of the gradient is taken into account and its magnitude ignored. We analyze this heuristic…
Variational quantum algorithms (VQAs) are promising methods that leverage noisy quantum computers and classical computing techniques for practical applications. In VQAs, classical optimizers such as gradient-based optimizers are…
We study the Wasserstein natural gradient in parametric statistical models with continuous sample spaces. Our approach is to pull back the $L^2$-Wasserstein metric tensor in the probability density space to a parameter space, equipping the…
Many machine learning problems can be expressed as the optimization of some cost functional over a parametric family of probability distributions. It is often beneficial to solve such optimization problems using natural gradient methods.…
We propose a gradient descent method for solving optimization problems arising in settings of tropical geometry - a variant of algebraic geometry that has attracted growing interest in applications such as computational biology, economics,…
The note considers normalized gradient descent (NGD), a natural modification of classical gradient descent (GD) in optimization problems. A serious shortcoming of GD in non-convex problems is that GD may take arbitrarily long to escape from…
Optical quantum circuits can be optimized using gradient descent methods, as the gates in a circuit can be parametrized by continuous parameters. However, the parameter space as seen by the cost function is not Euclidean, which means that…
Second-order training methods have better convergence properties than gradient descent but are rarely used in practice for large-scale training due to their computational overhead. This can be viewed as a hardware limitation (imposed by…
Natural-gradient methods markedly accelerate the training of Physics-Informed Neural Networks (PINNs), yet their Gauss-Newton update must be solved in the parameter space, incurring a prohibitive $O(n^3)$ time complexity, where $n$ is the…