Related papers: Second-Order Neural ODE Optimizer

Training Neural ODEs Using Fully Discretized Simultaneous Optimization

Neural Ordinary Differential Equations (Neural ODEs) represent continuous-time dynamics with neural networks, offering advancements for modeling and control tasks. However, training Neural ODEs requires solving differential equations at…

Machine Learning · Computer Science 2025-02-24 Mariia Shapovalova, Calvin Tsay

Scalable Second Order Optimization for Deep Learning

Optimization in machine learning, both theoretical and applied, is presently dominated by first-order gradient methods such as stochastic gradient descent. Second-order optimization methods, that involve second derivatives and/or second…

Machine Learning · Computer Science 2021-03-08 Rohan Anil, Vineet Gupta, Tomer Koren, Kevin Regan, Yoram Singer

DDPNOpt: Differential Dynamic Programming Neural Optimizer

Interpretation of Deep Neural Networks (DNNs) training as an optimal control problem with nonlinear dynamical systems has received considerable attention recently, yet the algorithmic development remains relatively limited. In this work, we…

Machine Learning · Computer Science 2021-06-14 Guan-Horng Liu, Tianrong Chen, Evangelos A. Theodorou

A memory-efficient neural ODE framework based on high-level adjoint differentiation

Neural ordinary differential equations (neural ODEs) have emerged as a novel network architecture that bridges dynamical systems and deep learning. However, the gradient obtained with the continuous adjoint method in the vanilla neural ODE…

Machine Learning · Computer Science 2023-06-12 Hong Zhang, Wenjun Zhao

Review: Ordinary Differential Equations For Deep Learning

To better understand and improve the behavior of neural networks, a recent line of works bridged the connection between ordinary differential equations (ODEs) and deep neural networks (DNNs). The connections are made in two folds: (1) View…

Machine Learning · Computer Science 2019-11-05 Xinshi Chen

On Second Order Behaviour in Augmented Neural ODEs

Neural Ordinary Differential Equations (NODEs) are a new class of models that transform data continuously through infinite-depth architectures. The continuous nature of NODEs has made them particularly suitable for learning the dynamics of…

Machine Learning · Computer Science 2020-10-22 Alexander Norcliffe, Cristian Bodnar, Ben Day, Nikola Simidjievski, Pietro Liò

Deep Neural ODE Operator Networks for PDEs

Operator learning has emerged as a promising paradigm for developing efficient surrogate models to solve partial differential equations (PDEs). However, existing approaches often overlook the domain knowledge inherent in the underlying PDEs…

Machine Learning · Computer Science 2025-10-20 Ziqian Li, Kang Liu, Yongcun Song, Hangrui Yue, Enrique Zuazua

OCP-GN: A Scalable Second-order Optimizer for Stochastic Optimization

This paper proposes a novel second-order optimization algorithm based on the Optimal Control Principle (OCP), applicable to large-scale optimization problems in neural network training. The algorithm has a computational complexity of O(d)…

Computer Vision and Pattern Recognition · Computer Science 2026-05-12 Jindi Zhong, Congyaohui Yin, Zhaorong Zhang, Huanshui Zhang

Towards Guided Descent: Optimization Algorithms for Training Neural Networks At Scale

Neural network optimization remains one of the most consequential yet poorly understood challenges in modern AI research, where improvements in training algorithms can lead to enhanced feature learning in foundation models,…

Machine Learning · Computer Science 2025-12-23 Ansh Nagwekar

Neural Ordinary Differential Equations

We introduce a new family of deep neural network models. Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network. The output of the network is computed using a…

Machine Learning · Computer Science 2019-12-17 Ricky T. Q. Chen, Yulia Rubanova, Jesse Bettencourt, David Duvenaud

ANODE: Unconditionally Accurate Memory-Efficient Gradients for Neural ODEs

Residual neural networks can be viewed as the forward Euler discretization of an Ordinary Differential Equation (ODE) with a unit time step. This has recently motivated researchers to explore other discretization approaches and train ODE…

Machine Learning · Computer Science 2019-07-02 Amir Gholami, Kurt Keutzer, George Biros

Neural Differential Equations for Learning to Program Neural Nets Through Continuous Learning Rules

Neural ordinary differential equations (ODEs) have attracted much attention as continuous-time counterparts of deep residual neural networks (NNs), and numerous extensions for recurrent NNs have been proposed. Since the 1980s, ODEs have…

Machine Learning · Computer Science 2022-10-17 Kazuki Irie, Francesco Faccio, Jürgen Schmidhuber

Neural Ordinary Differential Equations for Model Order Reduction of Stiff Systems

Neural Ordinary Differential Equations (ODEs) represent a significant advancement at the intersection of machine learning and dynamical systems, offering a continuous-time analog to discrete neural networks. Despite their promise, deploying…

Numerical Analysis · Mathematics 2025-06-18 Matteo Caldana, Jan S. Hesthaven

ResNet After All? Neural ODEs and Their Numerical Solution

A key appeal of the recently proposed Neural Ordinary Differential Equation (ODE) framework is that it seems to provide a continuous-time extension of discrete residual neural networks. As we show herein, though, trained Neural ODE models…

Machine Learning · Computer Science 2023-09-12 Katharina Ott, Prateek Katiyar, Philipp Hennig, Michael Tiemann

Optimal Control Theoretic Neural Optimizer: From Backpropagation to Dynamic Programming

Optimization of deep neural networks (DNNs) has been a driving force in the advancement of modern machine learning and artificial intelligence. With DNNs characterized by a prolonged sequence of nonlinear propagation, determining their…

Machine Learning · Computer Science 2025-10-17 Guan-Horng Liu, Tianrong Chen, Evangelos A. Theodorou

Accelerating Neural ODEs Using Model Order Reduction

Embedding nonlinear dynamical systems into artificial neural networks is a powerful new formalism for machine learning. By parameterizing ordinary differential equations (ODEs) as neural network layers, these Neural ODEs are…

Machine Learning · Computer Science 2024-10-28 Mikko Lehtimäki, Lassi Paunonen, Marja-Leena Linne

Depth-Adaptive Neural Networks from the Optimal Control viewpoint

In recent years, deep learning has been connected with optimal control as a way to define a notion of a continuous underlying learning problem. In this view, neural networks can be interpreted as a discretization of a parametric Ordinary…

Optimization and Control · Mathematics 2020-07-07 Joubine Aghili, Olga Mula

Dimer-Enhanced Optimization: A First-Order Approach to Escaping Saddle Points in Neural Network Training

First-order optimization methods, such as SGD and Adam, are widely used for training large-scale deep neural networks due to their computational efficiency and robust performance. However, relying solely on gradient information, these…

Machine Learning · Computer Science 2025-07-29 Yue Hu, Zanxia Cao, Yingchao Liu

Continuous Learned Primal Dual

Neural ordinary differential equations (Neural ODEs) propose the idea that a sequence of layers in a neural network is just a discretisation of an ODE, and thus can instead be directly modelled by a parameterised ODE. This idea has had…

Machine Learning · Computer Science 2024-05-07 Christina Runkel, Ander Biguri, Carola-Bibiane Schönlieb

Adaptive higher order reversible integrators for memory efficient deep learning

The depth of networks plays a crucial role in the effectiveness of deep learning. However, the memory requirement for backpropagation scales linearly with the number of layers, which leads to memory bottlenecks during training. Moreover,…

Numerical Analysis · Mathematics 2025-02-20 Sofya Maslovskaya, Sina Ober-Blöbaum, Christian Offen, Pranav Singh, Boris Wembe