Related papers: Gradient Monitored Reinforcement Learning

Adaptive Gradient Method with Resilience and Momentum

Several variants of stochastic gradient descent (SGD) have been proposed to improve the learning effectiveness and efficiency when training deep neural networks, among which some recent influential attempts would like to adaptively control…

Machine Learning · Computer Science 2020-10-22 Jie Liu, Chen Lin, Chuming Li, Lu Sheng, Ming Sun, Junjie Yan, Wanli Ouyang

An Adaptive Gradient Method with Energy and Momentum

We introduce a novel algorithm for gradient-based optimization of stochastic objective functions. The method may be seen as a variant of SGD with momentum equipped with an adaptive learning rate automatically adjusted by an 'energy'…

Optimization and Control · Mathematics 2022-03-24 Hailiang Liu, Xuping Tian

Classifier-guided Gradient Modulation for Enhanced Multimodal Learning

Multimodal learning has developed very fast in recent years. However, during the multimodal training process, the model tends to rely on only one modality based on which it could learn faster, thus leading to inadequate use of other…

Machine Learning · Computer Science 2024-11-05 Zirun Guo, Tao Jin, Jingyuan Chen, Zhou Zhao

Gradient Flow Matching for Learning Update Dynamics in Neural Network Training

Training deep neural networks remains computationally intensive due to the iterative nature of gradient-based optimization. We propose Gradient Flow Matching (GFM), a continuous-time modeling framework that treats neural network training…

Machine Learning · Computer Science 2025-05-27 Xiao Shou, Yanna Ding, Jianxi Gao

Adaptive Gradient Regularization: A Faster and Generalizable Optimization Technique for Deep Neural Networks

Stochastic optimization plays a crucial role in the advancement of deep learning technologies. Over the decades, significant effort has been dedicated to improving the training efficiency and robustness of deep neural networks, via various…

Machine Learning · Computer Science 2024-08-21 Huixiu Jiang, Ling Yang, Yu Bao, Rutong Si, Sikun Yang

Asymmetric Momentum: A Rethinking of Gradient Descent

Through theoretical and experimental validation, unlike all existing adaptive methods like Adam which penalize frequently-changing parameters and are only applicable to sparse gradients, we propose the simplest SGD enhanced method,…

Machine Learning · Computer Science 2023-10-04 Gongyue Zhang, Dinghuang Zhang, Shuwen Zhao, Donghan Liu, Carrie M. Toptan, Honghai Liu

Gradient Descent based Optimization Algorithms for Deep Learning Models Training

In this paper, we aim at providing an introduction to the gradient descent based optimization algorithms for learning deep neural network models. Deep learning models involving multiple nonlinear projection layers are very challenging to…

Machine Learning · Computer Science 2019-03-12 Jiawei Zhang

Adaptive Memory Momentum via a Model-Based Framework for Deep Learning Optimization

The vast majority of modern deep learning models are trained with momentum-based first-order optimizers. The momentum term governs the optimizer's memory by determining how much each past gradient contributes to the current convergence…

Machine Learning · Computer Science 2026-05-12 Kristi Topollai, Anna Choromanska

Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks

Adaptive gradient methods, which adopt historical gradient information to automatically adjust the learning rate, despite the nice property of fast convergence, have been observed to generalize worse than stochastic gradient descent (SGD)…

Machine Learning · Computer Science 2020-06-24 Jinghui Chen, Dongruo Zhou, Yiqi Tang, Ziyan Yang, Yuan Cao, Quanquan Gu

Learning Multimodal Fixed-Point Weights using Gradient Descent

Due to their high computational complexity, deep neural networks are still limited to powerful processing units. To promote a reduced model complexity by dint of low-bit fixed-point quantization, we propose a gradient-based optimization…

Machine Learning · Computer Science 2019-07-18 Lukas Enderich, Fabian Timm, Lars Rosenbaum, Wolfram Burgard

Hindsight-Guided Momentum (HGM) Optimizer: An Approach to Adaptive Learning Rate

We introduce Hindsight-Guided Momentum (HGM), a first-order optimization algorithm that adaptively scales learning rates based on the directional consistency of recent updates. Traditional adaptive methods, such as Adam or RMSprop, adapt…

Optimization and Control · Mathematics 2025-07-01 Krisanu Sarkar

Adaptive Learning Rate and Momentum for Training Deep Neural Networks

Recent progress on deep learning relies heavily on the quality and efficiency of training algorithms. In this paper, we develop a fast training method motivated by the nonlinear Conjugate Gradient (CG) framework. We propose the Conjugate…

Machine Learning · Computer Science 2021-07-28 Zhiyong Hao, Yixuan Jiang, Huihua Yu, Hsiao-Dong Chiang

Accelerating Gradient Boosting Machine

Gradient Boosting Machine (GBM) is an extremely powerful supervised learning algorithm that is widely used in practice. GBM routinely features as a leading algorithm in machine learning competitions such as Kaggle and the KDDCup. In this…

Machine Learning · Computer Science 2020-08-28 Haihao Lu, Sai Praneeth Karimireddy, Natalia Ponomareva, Vahab Mirrokni

Binomial Gradient-Based Meta-Learning for Enhanced Meta-Gradient Estimation

Meta-learning offers a principled framework leveraging task-invariant priors from related tasks, with which task-specific models can be fine-tuned on downstream tasks, even with limited data records. Gradient-based…

Machine Learning · Computer Science 2026-04-16 Yilang Zhang, Abraham Jaeger Mountain, Bingcong Li, Georgios B. Giannakis

Gradient Projection Memory for Continual Learning

The ability to learn continually without forgetting the past tasks is a desired attribute for artificial learning systems. Existing approaches to enable such learning in artificial neural networks usually rely on network growth, importance…

Machine Learning · Computer Science 2021-03-18 Gobinda Saha, Isha Garg, Kaushik Roy

A New Accelerated Stochastic Gradient Method with Momentum

In this paper, we propose a novel accelerated stochastic gradient method with momentum, in which the momentum is a weighted average of previous gradients. The weights decay in inverse proportion to the iteration number. Stochastic gradient…

Machine Learning · Computer Science 2020-06-02 Liang Liu, Xiaopeng Luo

Stochastic Gradient Descent with Nonlinear Conjugate Gradient-Style Adaptive Momentum

Momentum plays a crucial role in stochastic gradient-based optimization algorithms for accelerating or improving the training of deep neural networks (DNNs). In deep learning practice, the momentum is usually weighted by a well-calibrated…

Machine Learning · Computer Science 2020-12-04 Bao Wang, Qiang Ye

Sequential Training of Neural Networks with Gradient Boosting

This paper presents a novel technique based on gradient boosting to train the final layers of a neural network (NN). Gradient boosting is an additive expansion algorithm in which a series of models are trained sequentially to approximate a…

Machine Learning · Computer Science 2023-05-05 Seyedsaman Emami, Gonzalo Martínez-Muñoz

Reinforced stochastic gradient descent for deep neural network learning

Stochastic gradient descent (SGD) is a standard optimization method to minimize a training error with respect to network parameters in modern neural network learning. However, it typically suffers from proliferation of saddle points in the…

Machine Learning · Computer Science 2017-11-23 Haiping Huang, Taro Toyoizumi

Post Reinforcement Learning Inference

We study estimation and inference using data collected by reinforcement learning (RL) algorithms. These algorithms adaptively experiment by interacting with individual units over multiple stages, updating their strategies based on past…

Machine Learning · Statistics 2025-10-06 Vasilis Syrgkanis, Ruohan Zhan