Related papers: ADAM Optimization with Adaptive Batch Selection

Adam with Bandit Sampling for Deep Learning

Adam is a widely used optimization method for training deep learning models. It computes individual adaptive learning rates for different parameters. In this paper, we propose a generalization of Adam, called Adambs, that allows us to also…

Machine Learning · Computer Science 2020-10-27 Rui Liu, Tianyi Wu, Barzan Mozafari

Improving Portfolio Optimization Results with Bandit Networks

In Reinforcement Learning (RL), multi-armed Bandit (MAB) problems have found applications across diverse domains such as recommender systems, healthcare, and finance. Traditional MAB algorithms typically assume stationary reward…

Artificial Intelligence · Computer Science 2024-10-10 Gustavo de Freitas Fonseca, Lucas Coelho e Silva, Paulo André Lima de Castro

AdaptiveBandit: A multi-armed bandit framework for adaptive sampling in molecular simulations

Sampling from the equilibrium distribution has always been a major problem in molecular simulations due to the very high dimensionality of conformational space. Over several decades, many approaches have been used to overcome the problem.…

Computational Physics · Physics 2020-03-02 Adrià Pérez, Pablo Herrera-Nieto, Stefan Doerr, Gianni De Fabritiis

Towards Practical Adam: Non-Convexity, Convergence Theory, and Mini-Batch Acceleration

Adam is one of the most influential adaptive stochastic algorithms for training deep neural networks, which has been pointed out to be divergent even in the simple convex setting via a few simple counterexamples. Many attempts, such as…

Machine Learning · Computer Science 2022-08-09 Congliang Chen, Li Shen, Fangyu Zou, Wei Liu

CAdam: Confidence-Based Optimization for Online Learning

Modern recommendation systems frequently employ online learning to dynamically update their models with freshly collected data. The most commonly used optimizer for updating neural networks in these contexts is the Adam optimizer, which…

Machine Learning · Computer Science 2025-06-05 Shaowen Wang, Anan Liu, Jian Xiao, Huan Liu, Yuekui Yang, Cong Xu, Qianqian Pu, Suncong Zheng, Wei Zhang, Di Wang, Jie Jiang, Jian Li

On the Trend-corrected Variant of Adaptive Stochastic Optimization Methods

Adam-type optimizers, as a class of adaptive moment estimation methods with the exponential moving average scheme, have been successfully used in many applications of deep learning. Such methods are appealing due to the capability on…

Machine Learning · Computer Science 2020-12-17 Bingxin Zhou, Xuebin Zheng, Junbin Gao

Adaptive Data Augmentation with Multi-armed Bandit: Sample-Efficient Embedding Calibration for Implicit Pattern Recognition

Recognizing implicit visual and textual patterns is essential in many real-world applications of modern AI. However, tackling long-tail pattern recognition tasks remains challenging for current pre-trained foundation models such as LLMs and…

Computer Vision and Pattern Recognition · Computer Science 2026-03-03 Minxue Tang, Yangyang Yu, Aolin Ding, Maziyar Baran Pouyan, Taha Belkhouja, Yujia Bao

Practical Bayesian Learning of Neural Networks via Adaptive Optimisation Methods

We introduce a novel framework for the estimation of the posterior distribution over the weights of a neural network, based on a new probabilistic interpretation of adaptive optimisation algorithms such as AdaGrad and Adam. We demonstrate…

Machine Learning · Statistics 2020-07-21 Samuel Kessler, Arnold Salas, Vincent W. C. Tan, Stefan Zohren, Stephen Roberts

A Theoretical and Experimental Study of a Novel Adaptive Learning Algorithm

A crucial component of machine learning algorithms is minimizing loss functions with less computational cost and less oscillations. While adaptive learning rate-based optimizers have been widely used for real-world tasks, they do not…

Machine Learning · Computer Science 2026-05-29 Sakshi Kumari, Shyam Kumar M, Sushmitha P

BADM: Batch ADMM for Deep Learning

Stochastic gradient descent-based algorithms are widely used for training deep neural networks but often suffer from slow convergence. To address the challenge, we leverage the framework of the alternating direction method of multipliers…

Machine Learning · Computer Science 2025-02-03 Ouya Wang, Shenglong Zhou, Geoffrey Ye Li

AdamZ: An Enhanced Optimisation Method for Neural Network Training

AdamZ is an advanced variant of the Adam optimiser, developed to enhance convergence efficiency in neural network training. This optimiser dynamically adjusts the learning rate by incorporating mechanisms to address overshooting and…

Machine Learning · Computer Science 2024-11-26 Ilia Zaznov, Atta Badii, Alfonso Dufour, Julian Kunkel

A Control Theoretic Framework for Adaptive Gradient Optimizers in Machine Learning

Adaptive gradient methods have become popular in optimizing deep neural networks; recent examples include AdaGrad and Adam. Although Adam usually converges faster, variations of Adam, for instance, the AdaBelief algorithm, have been…

Machine Learning · Computer Science 2024-10-29 Kushal Chakrabarti, Nikhil Chopra

Stochastic Optimization with Bandit Sampling

Many stochastic optimization algorithms work by estimating the gradient of the cost function on the fly by sampling datapoints uniformly at random from a training set. However, the estimator might have a large variance, which inadvertently…

Machine Learning · Computer Science 2017-08-10 Farnood Salehi, L. Elisa Celis, Patrick Thiran

Adam$^+$: A Stochastic Method with Adaptive Variance Reduction

Adam is a widely used stochastic optimization method for deep learning applications. While practitioners prefer Adam because it requires less parameter tuning, its use is problematic from a theoretical point of view since it may not…

Machine Learning · Computer Science 2020-11-25 Mingrui Liu, Wei Zhang, Francesco Orabona, Tianbao Yang

A Novel Convergence Analysis for Algorithms of the Adam Family

Since its invention in 2014, the Adam optimizer has received tremendous attention. On one hand, it has been widely used in deep learning and many variants have been proposed, while on the other hand their theoretical convergence property…

Machine Learning · Computer Science 2021-12-08 Zhishuai Guo, Yi Xu, Wotao Yin, Rong Jin, Tianbao Yang

Understanding the Generalization of Stochastic Gradient Adam in Learning Neural Networks

Adam is a popular and widely used adaptive gradient method in deep learning, which has also received tremendous focus in theoretical research. However, most existing theoretical work primarily analyzes its full-batch version, which differs…

Machine Learning · Computer Science 2025-10-14 Xuan Tang, Han Zhang, Yuan Cao, Difan Zou

Integrating Multi-Armed Bandit, Active Learning, and Distributed Computing for Scalable Optimization

Modern optimization problems in scientific and engineering domains often rely on expensive black-box evaluations, such as those arising in physical simulations or deep learning pipelines, where gradient information is unavailable or…

Computation · Statistics 2026-01-05 Foo Hui-Mean, Yuan-chin Ivan Chang

Stochastic Gradient Sampling for Enhancing Neural Networks Training

In this paper, we introduce StochGradAdam, a novel optimizer designed as an extension of the Adam algorithm, incorporating stochastic gradient sampling techniques to improve computational efficiency while maintaining robust performance.…

Machine Learning · Computer Science 2025-03-19 Juyoung Yun

An Improved Adaptive PID Optimizer with Enhanced Convergence and Stability for Deep Learning

Optimization is essential in deep learning. The foundational method upon which most optimizers are built is momentum-based stochastic gradient descent. However, it suffers from two key drawbacks. First, it has noisy and varying gradients,…

Machine Learning · Computer Science 2026-05-22 Saurabh Saini, Kapil Ahuja, Thomas Wick, Saurav Kumar

Combinatorial Allocation Bandits with Nonlinear Arm Utility

A matching platform is a system that matches different types of participants, such as companies and job-seekers. In such a platform, merely maximizing the number of matches can result in matches being concentrated on highly popular…

Machine Learning · Computer Science 2026-03-10 Yuki Shibukawa, Koichi Tanaka, Yuta Saito, Shinji Ito