Related papers: Semi-Implicit Back Propagation

200 papers

State-of-the-art training algorithms for deep learning models are based on stochastic gradient descent (SGD). Recently, many variations have been explored: perturbing parameters for better accuracy (such as in Extragradient), limiting SGD…

Machine Learning · Computer Science 2022-03-23 Amirkeivan Mohtashami, Martin Jaggi, Sebastian U. Stich

Backpropagation (BP) is the most successful and widely used algorithm in deep learning. However, the computations required by BP are challenging to reconcile with known neurobiology. This difficulty has stimulated interest in more…

Neural and Evolutionary Computing · Computer Science 2022-06-02 Nick Alonso, Beren Millidge, Jeff Krichmar, Emre Neftci

We propose a memory efficient method, named Stochastic Backpropagation (SBP), for training deep neural networks on videos. It is based on the finding that gradients from incomplete execution for backpropagation can still effectively train…

Computer Vision and Pattern Recognition · Computer Science 2022-04-01 Feng Cheng, Mingze Xu, Yuanjun Xiong, Hao Chen, Xinyu Li, Wei Li, Wei Xia

Stochastic gradient descent (SGD) has achieved great success in training deep neural networks, where the gradient is computed through back-propagation. However, the back-propagated values of different layers vary dramatically. This…

Machine Learning · Statistics 2018-02-28 Huishuai Zhang, Wei Chen, Tie-Yan Liu

Stochastic Gradient Descent (SGD) has proven to be remarkably effective in optimizing deep neural networks that employ ever-larger numbers of parameters. Yet, improving the efficiency of large-scale optimization remains a vital and highly…

Machine Learning · Computer Science 2020-11-11 Frithjof Gressmann, Zach Eaton-Rosen, Carlo Luschi

Arguably the biggest challenge in applying neural networks is tuning the hyperparameters, in particular the learning rate. The sensitivity to the learning rate is due to the reliance on backpropagation to train the network. In this paper we…

Machine Learning · Statistics 2018-08-08 Francois Fagan, Garud Iyengar

We propose proximal backpropagation (ProxProp) as a novel algorithm that takes implicit instead of explicit gradient steps to update the network parameters during neural network training. Our algorithm is motivated by the step size…

Machine Learning · Computer Science 2018-02-21 Thomas Frerix, Thomas Möllenhoff, Michael Moeller, Daniel Cremers

In this paper, we provide an in-depth study of Stochastic Backpropagation (SBP) when training deep neural networks for standard image classification and object detection tasks. During backward propagation, SBP calculates the gradients by…

Computer Vision and Pattern Recognition · Computer Science 2022-10-04 Jun Fang, Mingze Xu, Hao Chen, Bing Shuai, Zhuowen Tu, Joseph Tighe

Training Deep Neural Networks (DNNs) with small batches using Stochastic Gradient Descent (SGD) yields superior test performance compared to larger batches. The specific noise structure inherent to SGD is known to be responsible for this…

Machine Learning · Statistics 2024-02-14 Tom Sander, Maxime Sylvestre, Alain Durmus

We showcase important features of the dynamics of the Stochastic Gradient Descent (SGD) in the training of neural networks. We present empirical observations that commonly used large step sizes (i) lead the iterates to jump from one side of…

Machine Learning · Computer Science 2023-06-08 Maksym Andriushchenko, Aditya Varre, Loucas Pillaud-Vivien, Nicolas Flammarion

Neural network optimization remains one of the most consequential yet poorly understood challenges in modern AI research, where improvements in training algorithms can lead to enhanced feature learning in foundation models,…

Machine Learning · Computer Science 2025-12-23 Ansh Nagwekar

Spiking neural networks (SNNs) are brain-inspired models that enable energy-efficient implementation on neuromorphic hardware. However, the supervised training of SNNs remains a hard problem due to the discontinuity of the spiking neuron…

Neural and Evolutionary Computing · Computer Science 2021-12-20 Mingqing Xiao, Qingyan Meng, Zongpeng Zhang, Yisen Wang, Zhouchen Lin

The Spiking Neural Network (SNN) is a biologically inspired neural network infrastructure that has recently garnered significant attention. It utilizes binary spike activations to transmit information, thereby replacing multiplications with…

Computer Vision and Pattern Recognition · Computer Science 2024-10-01 Yufei Guo, Yuanpei Chen, Zecheng Hao, Weihang Peng, Zhou Jie, Yuhan Zhang, Xiaode Liu, Zhe Ma

The back-propagation (BP) algorithm has been considered the de-facto method for training deep neural networks. It back-propagates errors from the output layer to the hidden layers in an exact manner using the transpose of the feedforward…

Neural and Evolutionary Computing · Computer Science 2018-05-01 Hongyin Luo, Jie Fu, James Glass

The massive size of modern neural networks has motivated substantial recent interest in neural network quantization. We introduce Stochastic Markov Gradient Descent (SMGD), a discrete optimization method applicable to training quantized…

Machine Learning · Computer Science 2020-12-23 Jonathan Ashbrock, Alexander M. Powell

The stochastic gradient descent (SGD) algorithm is the algorithm we use to train neural networks. However, it remains poorly understood how the SGD navigates the highly nonlinear and degenerate loss landscape of a neural network. In this…

Machine Learning · Computer Science 2025-06-13 Liu Ziyin, Hongchao Li, Masahito Ueda

Neural network learning is usually time-consuming since backpropagation needs to compute full gradients and backpropagate them across multiple layers. Despite the success of existing works in accelerating propagation through sparseness, the…

Machine Learning · Computer Science 2020-10-28 Zhiyuan Zhang, Pengcheng Yang, Xuancheng Ren, Qi Su, Xu Sun

Artificial neural networks are most commonly trained with the back-propagation algorithm, where the gradient for learning is provided by back-propagating the error, layer by layer, from the output layer to the hidden layers. A recently…

Machine Learning · Statistics 2016-12-22 Arild Nøkland

Physics-informed neural networks (PINNs) have effectively been demonstrated in solving forward and inverse differential equation problems, but they are still trapped in training failures when the target functions to be approximated exhibit…

Machine Learning · Computer Science 2023-03-06 Ye Li, Song-Can Chen, Sheng-Jun Huang

Stochastic gradient descent (SGD) has been the dominant optimization method for training deep neural networks due to its many desirable properties. One of the more remarkable and least understood qualities of SGD is that it generalizes…

Machine Learning · Computer Science 2020-07-03 Erhan Bilal