Related papers: Gradient Normalization & Depth Based Decay For Dee…

Backward Gradient Normalization in Deep Neural Networks

We introduce a new technique for gradient normalization during neural network training. The gradients are rescaled during the backward pass using normalization layers introduced at certain points within the network architecture. These…

Machine Learning · Computer Science 2021-06-18 Alejandro Cabana, Luis F. Lago-Fernández

Penalizing Gradient Norm for Efficiently Improving Generalization in Deep Learning

How to train deep neural networks (DNNs) to generalize well is a central concern in deep learning, especially for severely overparameterized networks nowadays. In this paper, we propose an effective method to improve the model…

Machine Learning · Computer Science 2022-06-28 Yang Zhao, Hao Zhang, Xiuyuan Hu

Generalized Deep Learning-based Proximal Gradient Descent for MR Reconstruction

The data consistency for the physical forward model is crucial in inverse problems, especially in MR imaging reconstruction. The standard way is to unroll an iterative algorithm into a neural network with a forward model embedded. The…

Image and Video Processing · Electrical Eng. & Systems 2023-06-28 Guanxiong Luo, Mengmeng Kuang, Peng Cao

A Comprehensive Study on Optimization Strategies for Gradient Descent In Deep Learning

One of the most important parts of Artificial Neural Networks is minimizing the loss functions, which tell us how good or bad our model is. To minimize these losses we need to tune the weights and biases. Also to calculate the minimum value…

Machine Learning · Computer Science 2021-01-08 Kaustubh Yadav

Layerwise Optimization by Gradient Decomposition for Continual Learning

Deep neural networks achieve state-of-the-art and sometimes super-human performance across various domains. However, when learning tasks sequentially, the networks easily forget the knowledge of previous tasks, known as "catastrophic…

Computer Vision and Pattern Recognition · Computer Science 2021-05-18 Shixiang Tang, Dapeng Chen, Jinguo Zhu, Shijie Yu, Wanli Ouyang

Learning Deep Gradient Descent Optimization for Image Deconvolution

As an integral component of blind image deblurring, non-blind deconvolution removes image blur with a given blur kernel, which is essential but difficult due to the ill-posed nature of the inverse problem. The predominant approach is based…

Computer Vision and Pattern Recognition · Computer Science 2020-02-18 Dong Gong, Zhen Zhang, Qinfeng Shi, Anton van den Hengel, Chunhua Shen, Yanning Zhang

Channel Normalization in Convolutional Neural Network avoids Vanishing Gradients

Normalization layers are widely used in deep neural networks to stabilize training. In this paper, we consider the training of convolutional neural networks with gradient descent on a single training example. This optimization problem…

Machine Learning · Computer Science 2019-07-24 Zhenwei Dai, Reinhard Heckel

Batch-less stochastic gradient descent for compressive learning of deep regularization for image denoising

We consider the problem of denoising with the help of prior information taken from a database of clean signals or images. Denoising with variational methods is very efficient if a regularizer well adapted to the nature of the data is…

Machine Learning · Computer Science 2023-10-06 Hui Shi, Yann Traonmilin, J-F Aujol

PathProx: A Proximal Gradient Algorithm for Weight Decay Regularized Deep Neural Networks

Weight decay is one of the most widely used forms of regularization in deep learning, and has been shown to improve generalization and robustness. The optimization objective driving weight decay is a sum of losses plus a term proportional…

Machine Learning · Computer Science 2023-07-07 Liu Yang, Jifan Zhang, Joseph Shenouda, Dimitris Papailiopoulos, Kangwook Lee, Robert D. Nowak

Gradient Descent based Optimization Algorithms for Deep Learning Models Training

In this paper, we aim at providing an introduction to the gradient descent based optimization algorithms for learning deep neural network models. Deep learning models involving multiple nonlinear projection layers are very challenging to…

Machine Learning · Computer Science 2019-03-12 Jiawei Zhang

A Novel Structured Natural Gradient Descent for Deep Learning

Natural gradient descent (NGD) provided deep insights and powerful tools to deep neural networks. However, the computation of the Fisher information matrix becomes more and more difficult as the network structure turns large and complex. This…

Machine Learning · Computer Science 2021-09-22 Weihua Liu, Xiabi Liu

Learning Gradient Descent: Better Generalization and Longer Horizons

Training deep neural networks is a highly nontrivial task, involving carefully selecting appropriate training algorithms, scheduling step sizes and tuning other hyperparameters. Trying different combinations can be quite labor-intensive and…

Machine Learning · Computer Science 2017-06-13 Kaifeng Lv, Shunhua Jiang, Jian Li

Gradient Regularization Improves Accuracy of Discriminative Models

Regularizing the gradient norm of the output of a neural network with respect to its inputs is a powerful technique, rediscovered several times. This paper presents evidence that gradient regularization can consistently improve…

Machine Learning · Computer Science 2018-05-28 Dániel Varga, Adrián Csiszárik, Zsolt Zombori

Analysis of Natural Gradient Descent for Multilayer Neural Networks

Natural gradient descent is a principled method for adapting the parameters of a statistical model on-line using an underlying Riemannian parameter space to redefine the direction of steepest descent. The algorithm is examined via methods…

Disordered Systems and Neural Networks · Physics 2009-10-31 Magnus Rattray, David Saad

DORNet: A Degradation Oriented and Regularized Network for Blind Depth Super-Resolution

Recent RGB-guided depth super-resolution methods have achieved impressive performance under the assumption of fixed and known degradation (e.g., bicubic downsampling). However, in real-world scenarios, captured depth data often suffer from…

Computer Vision and Pattern Recognition · Computer Science 2025-03-20 Zhengxue Wang, Zhiqiang Yan, Jinshan Pan, Guangwei Gao, Kai Zhang, Jian Yang

Input Normalized Stochastic Gradient Descent Training of Deep Neural Networks

In this paper, we propose a novel optimization algorithm for training machine learning models called Input Normalized Stochastic Gradient Descent (INSGD), inspired by the Normalized Least Mean Squares (NLMS) algorithm used in adaptive…

Machine Learning · Computer Science 2023-06-28 Salih Atici, Hongyi Pan, Ahmet Enis Cetin

Adding Gradient Noise Improves Learning for Very Deep Networks

Deep feedforward and recurrent networks have achieved impressive results in many perception and language processing applications. This success is partially attributed to architectural innovations such as convolutional and long short-term…

Machine Learning · Statistics 2015-11-24 Arvind Neelakantan, Luke Vilnis, Quoc V. Le, Ilya Sutskever, Lukasz Kaiser, Karol Kurach, James Martens

Adaptive Gradient Regularization: A Faster and Generalizable Optimization Technique for Deep Neural Networks

Stochastic optimization plays a crucial role in the advancement of deep learning technologies. Over the decades, significant effort has been dedicated to improving the training efficiency and robustness of deep neural networks, via various…

Machine Learning · Computer Science 2024-08-21 Huixiu Jiang, Ling Yang, Yu Bao, Rutong Si, Sikun Yang

Gradient Normalization for Generative Adversarial Networks

In this paper, we propose a novel normalization method called gradient normalization (GN) to tackle the training instability of Generative Adversarial Networks (GANs) caused by the sharp gradient space. Unlike existing work such as gradient…

Machine Learning · Computer Science 2021-10-12 Yi-Lun Wu, Hong-Han Shuai, Zhi-Rui Tam, Hong-Yu Chiu

Beyond Gradient Descent for Regularized Segmentation Losses

The simplicity of gradient descent (GD) made it the default method for training ever-deeper and complex neural networks. Both loss functions and architectures are often explicitly tuned to be amenable to this basic local optimization. In…

Machine Learning · Computer Science 2019-04-30 Dmitrii Marin, Meng Tang, Ismail Ben Ayed, Yuri Boykov