Related papers: Memory-Efficient Backpropagation through Large Lin…

Memory-efficient Learning for Large-scale Computational Imaging

Critical aspects of computational imaging systems, such as experimental design and image priors, can be optimized through deep networks formed by the unrolled iterations of classical model-based reconstructions (termed physics-based…

Computer Vision and Pattern Recognition · Computer Science 2020-03-13 Michael Kellman , Kevin Zhang , Jon Tamir , Emrah Bostan , Michael Lustig , Laura Waller

Towards Scalable Backpropagation-Free Gradient Estimation

While backpropagation--reverse-mode automatic differentiation--has been extraordinarily successful in deep learning, it requires two passes (forward and backward) through the neural network and the storage of intermediate activations.…

Machine Learning · Computer Science 2025-11-06 Daniel Wang , Evan Markou , Dylan Campbell

Backprop with Approximate Activations for Memory-efficient Network Training

Training convolutional neural network models is memory intensive since back-propagation requires storing activations of all intermediate layers. This presents a practical concern when seeking to deploy very deep architectures in production,…

Machine Learning · Computer Science 2019-10-30 Ayan Chakrabarti , Benjamin Moseley

Memorized Sparse Backpropagation

Neural network learning is usually time-consuming since backpropagation needs to compute full gradients and backpropagate them across multiple layers. Despite the success of existing works in accelerating propagation through sparseness, the…

Machine Learning · Computer Science 2020-10-28 Zhiyuan Zhang , Pengcheng Yang , Xuancheng Ren , Qi Su , Xu Sun

Local Pairwise Distance Matching for Backpropagation-Free Reinforcement Learning

Training neural networks with reinforcement learning (RL) typically relies on backpropagation (BP), necessitating storage of activations from the forward pass for subsequent backward updates. Furthermore, backpropagating error signals…

Machine Learning · Computer Science 2025-07-16 Daniel Tanneberg

Beyond Backpropagation: Optimization with Multi-Tangent Forward Gradients

The gradients used to train neural networks are typically computed using backpropagation. While an efficient way to obtain exact gradients, backpropagation is computationally expensive, hinders parallelization, and is biologically…

Machine Learning · Computer Science 2026-01-14 Katharina Flügel , Daniel Coquelin , Marie Weiel , Charlotte Debus , Achim Streit , Markus Götz

A Gentle Tutorial of Recurrent Neural Network with Error Backpropagation

We describe recurrent neural networks (RNNs), which have attracted great attention on sequential tasks, such as handwriting recognition, speech recognition and image to text. However, compared to general feedforward neural networks, RNNs…

Machine Learning · Computer Science 2018-01-16 Gang Chen

Faster Biological Gradient Descent Learning

Back-propagation is a popular machine learning algorithm that uses gradient descent in training neural networks for supervised learning, but can be very slow. A number of algorithms have been developed to speed up convergence and improve…

Neural and Evolutionary Computing · Computer Science 2020-09-29 Ho Ling Li

Using Linear Regression for Iteratively Training Neural Networks

We present a simple linear regression based approach for learning the weights and biases of a neural network, as an alternative to standard gradient based backpropagation. The present work is exploratory in nature, and we restrict the…

Machine Learning · Computer Science 2023-07-17 Harshad Khadilkar

Gradient Forward-Propagation for Large-Scale Temporal Video Modelling

How can neural networks be trained on large-volume temporal data efficiently? To compute the gradients required to update parameters, backpropagation blocks computations until the forward and backward passes are completed. For temporal…

Computer Vision and Pattern Recognition · Computer Science 2021-07-13 Mateusz Malinowski , Dimitrios Vytiniotis , Grzegorz Swirszcz , Viorica Patraucean , Joao Carreira

Memory-efficient Learning for Large-scale Computational Imaging -- NeurIPS deep inverse workshop

Computational imaging systems jointly design computation and hardware to retrieve information which is not traditionally accessible with standard imaging systems. Recently, critical aspects such as experimental design and image priors are…

Image and Video Processing · Electrical Eng. & Systems 2020-03-13 Michael Kellman , Jon Tamir , Emrah Bostan , Michael Lustig , Laura Waller

Learning Longer Memory in Recurrent Neural Networks

Recurrent neural network is a powerful model that learns temporal patterns in sequential data. For a long time, it was believed that recurrent networks are difficult to train using simple optimizers, such as stochastic gradient descent, due…

Neural and Evolutionary Computing · Computer Science 2015-04-20 Tomas Mikolov , Armand Joulin , Sumit Chopra , Michael Mathieu , Marc'Aurelio Ranzato

Memory-Efficient Backpropagation Through Time

We propose a novel approach to reduce memory consumption of the backpropagation through time (BPTT) algorithm when training recurrent neural networks (RNNs). Our approach uses dynamic programming to balance a trade-off between caching of…

Neural and Evolutionary Computing · Computer Science 2016-06-13 Audrūnas Gruslys , Remi Munos , Ivo Danihelka , Marc Lanctot , Alex Graves

Backward Lens: Projecting Language Model Gradients into the Vocabulary Space

Understanding how Transformer-based Language Models (LMs) learn and recall information is a key goal of the deep learning community. Recent interpretability methods project weights and hidden states obtained from the forward pass to the…

Computation and Language · Computer Science 2024-02-21 Shahar Katz , Yonatan Belinkov , Mor Geva , Lior Wolf

A Gradient-Interleaved Scheduler for Energy-Efficient Backpropagation for Training Neural Networks

This paper addresses design of accelerators using systolic architectures for training of neural networks using a novel gradient interleaving approach. Training the neural network involves backpropagation of error and computation of…

Signal Processing · Electrical Eng. & Systems 2023-02-27 Nanda Unnikrishnan , Keshab K. Parhi

Tensor-Based Backpropagation in Neural Networks with Non-Sequential Input

Neural networks have been able to achieve groundbreaking accuracy at tasks conventionally considered only doable by humans. Using stochastic gradient descent, optimization in many dimensions is made possible, albeit at a relatively high…

Machine Learning · Computer Science 2017-07-17 Hirsh R. Agarwal , Andrew Huang

Progressive Latent Replay for efficient Generative Rehearsal

We introduce a new method for internal replay that modulates the frequency of rehearsal based on the depth of the network. While replay strategies mitigate the effects of catastrophic forgetting in neural networks, recent works on…

Computer Vision and Pattern Recognition · Computer Science 2022-07-07 Stanisław Pawlak , Filip Szatkowski , Michał Bortkiewicz , Jan Dubiński , Tomasz Trzciński

A comparative study of back propagation and its alternatives on multilayer perceptrons

The de facto algorithm for training the back pass of a feedforward neural network is backpropagation (BP). The use of almost-everywhere differentiable activation functions made it efficient and effective to propagate the gradient backwards…

Neural and Evolutionary Computing · Computer Science 2022-06-14 John Waldo

Deep Recurrent Neural Networks for Time Series Prediction

Ability of deep networks to extract high level features and of recurrent networks to perform time-series inference have been studied. In view of universality of one hidden layer network at approximating functions under weak constraints, the…

Neural and Evolutionary Computing · Computer Science 2014-12-19 Sharat C. Prasad , Piyush Prasad

The Reversible Residual Network: Backpropagation Without Storing Activations

Deep residual networks (ResNets) have significantly pushed forward the state-of-the-art on image classification, increasing in performance as networks grow both deeper and wider. However, memory consumption becomes a bottleneck, as one…

Computer Vision and Pattern Recognition · Computer Science 2017-07-18 Aidan N. Gomez , Mengye Ren , Raquel Urtasun , Roger B. Grosse