Related papers: Adaptive higher order reversible integrators for m…

Efficient, Accurate and Stable Gradients for Neural ODEs

Training Neural ODEs requires backpropagating through an ODE solve. The state-of-the-art backpropagation method is recursive checkpointing that balances recomputation with memory cost. Here, we introduce a class of algebraically reversible…

Machine Learning · Computer Science 2025-01-30 Sam McCallum, James Foster

Reversible Architectures for Arbitrarily Deep Residual Neural Networks

Recently, deep residual networks have been successfully applied in many computer vision and natural language processing tasks, pushing the state-of-the-art performance with deeper and wider architectures. In this work, we interpret deep…

Computer Vision and Pattern Recognition · Computer Science 2017-11-21 Bo Chang, Lili Meng, Eldad Haber, Lars Ruthotto, David Begert, Elliot Holtham

Momentum Residual Neural Networks

The training of deep residual neural networks (ResNets) with backpropagation has a memory cost that increases linearly with respect to the depth of the network. A way to circumvent this issue is to use reversible architectures. In this…

Machine Learning · Computer Science 2021-07-23 Michael E. Sander, Pierre Ablin, Mathieu Blondel, Gabriel Peyré

Reversible Recurrent Neural Networks

Recurrent neural networks (RNNs) provide state-of-the-art performance in processing sequential data but are memory intensive to train, limiting the flexibility of RNN models which can be trained. Reversible RNNs---RNNs for which the…

Machine Learning · Computer Science 2018-10-26 Matthew MacKay, Paul Vicol, Jimmy Ba, Roger Grosse

Adaptive Feedforward Gradient Estimation in Neural ODEs

Neural Ordinary Differential Equations (Neural ODEs) represent a significant breakthrough in deep learning, promising to bridge the gap between machine learning and the rich theoretical frameworks developed in various mathematical fields…

Machine Learning · Computer Science 2024-09-24 Jaouad Dabounou

Accelerating Neural ODEs Using Model Order Reduction

Embedding nonlinear dynamical systems into artificial neural networks is a powerful new formalism for machine learning. By parameterizing ordinary differential equations (ODEs) as neural network layers, these Neural ODEs are…

Machine Learning · Computer Science 2024-10-28 Mikko Lehtimäki, Lassi Paunonen, Marja-Leena Linne

Understanding the Principles of Recursive Neural networks: A Generative Approach to Tackle Model Complexity

Recursive Neural Networks are non-linear adaptive models that are able to learn deep structured information. However, these models have not yet been broadly accepted. This fact is mainly due to its inherent complexity. In particular, not…

Neural and Evolutionary Computing · Computer Science 2009-11-18 Alejandro Chinea

Reversible designs for extreme memory cost reduction of CNN training

Training Convolutional Neural Networks (CNN) is a resource intensive task that requires specialized hardware for efficient computation. One of the most limiting bottleneck of CNN training is the memory cost associated with storing the…

Computer Vision and Pattern Recognition · Computer Science 2019-10-25 Tristan Hascoet, Quentin Febvre, Yasuo Ariki, Tetsuya Takiguchi

Layer-Specific Adaptive Learning Rates for Deep Networks

The increasing complexity of deep learning architectures is resulting in training time requiring weeks or even months. This slow training is due in part to vanishing gradients, in which the gradients used by back-propagation are extremely…

Computer Vision and Pattern Recognition · Computer Science 2015-10-16 Bharat Singh, Soham De, Yangmuzi Zhang, Thomas Goldstein, Gavin Taylor

A memory-efficient neural ODE framework based on high-level adjoint differentiation

Neural ordinary differential equations (neural ODEs) have emerged as a novel network architecture that bridges dynamical systems and deep learning. However, the gradient obtained with the continuous adjoint method in the vanilla neural ODE…

Machine Learning · Computer Science 2023-06-12 Hong Zhang, Wenjun Zhao

Layer-wise Adaptive Step-Sizes for Stochastic First-Order Methods for Deep Learning

We propose a new per-layer adaptive step-size procedure for stochastic first-order optimization methods for minimizing empirical loss functions in deep learning, eliminating the need for the user to tune the learning rate (LR). The proposed…

Machine Learning · Computer Science 2023-07-07 Achraf Bahamou, Donald Goldfarb

m-RevNet: Deep Reversible Neural Networks with Momentum

In recent years, the connections between deep residual networks and first-order Ordinary Differential Equations (ODEs) have been disclosed. In this work, we further bridge the deep neural architecture design with the second-order ODEs and…

Computer Vision and Pattern Recognition · Computer Science 2021-08-17 Duo Li, Shang-Hua Gao

Second-Order Neural ODE Optimizer

We propose a novel second-order optimization framework for training the emerging deep continuous-time models, specifically the Neural Ordinary Differential Equations (Neural ODEs). Since their training already involves expensive gradient…

Machine Learning · Computer Science 2021-11-09 Guan-Horng Liu, Tianrong Chen, Evangelos A. Theodorou

Imbedding Deep Neural Networks

Continuous-depth neural networks, such as Neural ODEs, have refashioned the understanding of residual neural networks in terms of non-linear vector-valued optimal control problems. The common solution is to use the adjoint sensitivity…

Machine Learning · Computer Science 2022-02-16 Andrew Corbett, Dmitry Kangin

Deep Recurrent Neural Networks for Time Series Prediction

Ability of deep networks to extract high level features and of recurrent networks to perform time-series inference have been studied. In view of universality of one hidden layer network at approximating functions under weak constraints, the…

Neural and Evolutionary Computing · Computer Science 2014-12-19 Sharat C. Prasad, Piyush Prasad

Neural Ordinary Differential Equations for Model Order Reduction of Stiff Systems

Neural Ordinary Differential Equations (ODEs) represent a significant advancement at the intersection of machine learning and dynamical systems, offering a continuous-time analog to discrete neural networks. Despite their promise, deploying…

Numerical Analysis · Mathematics 2025-06-18 Matteo Caldana, Jan S. Hesthaven

Memory-efficient Learning for Large-scale Computational Imaging -- NeurIPS deep inverse workshop

Computational imaging systems jointly design computation and hardware to retrieve information which is not traditionally accessible with standard imaging systems. Recently, critical aspects such as experimental design and image priors are…

Image and Video Processing · Electrical Eng. & Systems 2020-03-13 Michael Kellman, Jon Tamir, Emrah Boston, Michael Lustig, Laura Waller

Neural Operator Learning for Long-Time Integration in Dynamical Systems with Recurrent Neural Networks

Deep neural networks are an attractive alternative for simulating complex dynamical systems, as in comparison to traditional scientific computing methods, they offer reduced computational costs during inference and can be trained directly…

Machine Learning · Computer Science 2024-05-01 Katarzyna Michałowska, Somdatta Goswami, George Em Karniadakis, Signe Riemer-Sørensen

Hierarchical Deep Learning of Multiscale Differential Equation Time-Steppers

Nonlinear differential equations rarely admit closed-form solutions, thus requiring numerical time-stepping algorithms to approximate solutions. Further, many systems characterized by multiscale physics exhibit dynamics over a vast range of…

Machine Learning · Computer Science 2020-08-26 Yuying Liu, J. Nathan Kutz, Steven L. Brunton

ANODE: Unconditionally Accurate Memory-Efficient Gradients for Neural ODEs

Residual neural networks can be viewed as the forward Euler discretization of an Ordinary Differential Equation (ODE) with a unit time step. This has recently motivated researchers to explore other discretization approaches and train ODE…

Machine Learning · Computer Science 2019-07-02 Amir Gholami, Kurt Keutzer, George Biros