Related papers: Compositional ADAM: An Adaptive Compositional Solv…

200 papers

Adaptive Moment Estimation (ADAM) is a very popular training algorithm for deep neural networks and belongs to the family of adaptive gradient descent optimizers. However, to the best of the authors' knowledge, no complete convergence analysis…

Machine Learning · Computer Science 2021-02-22 Sebastian Bock, Martin Georg Weiß

Compositionality is a basic structural feature of both biological and artificial neural networks. Learning compositional functions via gradient descent incurs well known problems like vanishing and exploding gradients, making careful…

Neural and Evolutionary Computing · Computer Science 2021-01-11 Jeremy Bernstein, Jiawei Zhao, Markus Meister, Ming-Yu Liu, Anima Anandkumar, Yisong Yue

Adaptive gradient methods have shown excellent performances for solving many machine learning problems. Although multiple adaptive gradient methods were recently studied, they mainly focus on either empirical or theoretical aspects and also…

Optimization and Control · Mathematics 2022-05-13 Feihu Huang, Junyi Li, Heng Huang

Our objective is to estimate the unknown compositional input from its output response through an unknown system after estimating the inverse of the original system with a training set. The proposed methods using artificial neural networks…

Machine Learning · Computer Science 2020-01-27 Se Un Park

Recently, a number of learning-based optimization methods that combine data-driven architectures with the classical optimization algorithms have been proposed and explored, showing superior empirical performance in solving various ill-posed…

Machine Learning · Computer Science 2019-05-16 Xingyu Xie, Jianlong Wu, Zhisheng Zhong, Guangcan Liu, Zhouchen Lin

This paper develops an adaptive proximal alternating direction method of multipliers (ADMM) for solving linearly constrained, composite optimization problems under the assumption that the smooth component of the objective is weakly convex,…

Optimization and Control · Mathematics 2026-05-04 Leandro Farias Maia, David H. Gutman, Renato D. C. Monteiro, Gilson N. Silva

Computation methods for solving entropy-regularized reward optimization -- a class of problems widely used for fine-tuning generative models -- have advanced rapidly. Among those, Adjoint Matching (AM, Domingo-Enrich et al., 2025) has…

Machine Learning · Statistics 2026-02-17 Oswin So, Brian Karrer, Chuchu Fan, Ricky T. Q. Chen, Guan-Horng Liu

The Adaptive Momentum Estimation (Adam) algorithm is highly effective in training various deep learning tasks. Despite this, there's limited theoretical understanding for Adam, especially when focusing on its vanilla form in non-convex…

Optimization and Control · Mathematics 2025-02-25 Yusu Hong, Junhong Lin

Min-max saddle point games have recently been intensely studied, due to their wide range of applications, including training Generative Adversarial Networks (GANs). However, most of the recent efforts for solving them are limited to special…

Optimization and Control · Mathematics 2021-08-10 Babak Barazandeh, Tianjian Huang, George Michailidis

Adaptive algorithms like AdaGrad and AMSGrad are successful in nonconvex optimization owing to their parameter-agnostic ability -- requiring no a priori knowledge about problem-specific parameters nor tuning of learning rates. However, when…

Optimization and Control · Mathematics 2022-10-17 Junchi Yang, Xiang Li, Niao He

In deep learning, different kinds of deep networks typically need different optimizers, which have to be chosen after multiple trials, making the training process inefficient. To relieve this issue and consistently improve the model…

Machine Learning · Computer Science 2024-12-02 Xingyu Xie, Pan Zhou, Huan Li, Zhouchen Lin, Shuicheng Yan

Adaptive gradient-based optimization methods such as Adagrad, Rmsprop, and Adam are widely used in solving large-scale machine learning problems including deep learning. A number of schemes have been proposed in…

Machine Learning · Computer Science 2019-05-30 Parvin Nazari, Davoud Ataee Tarzanagh, George Michailidis

The alternating direction method of multipliers (ADMM) is commonly used for distributed model fitting problems, but its performance and reliability depend strongly on user-defined penalty parameters. We study distributed ADMM methods that…

Machine Learning · Computer Science 2017-06-21 Zheng Xu, Gavin Taylor, Hao Li, Mario Figueiredo, Xiaoming Yuan, Tom Goldstein

In this paper, we provide a rigorous proof of convergence of the Adaptive Moment Estimate (Adam) algorithm for a wide class of optimization objectives. Despite the popularity and efficiency of the Adam algorithm in training deep neural…

Optimization and Control · Mathematics 2023-11-08 Haochuan Li, Alexander Rakhlin, Ali Jadbabaie

Modern recommendation systems frequently employ online learning to dynamically update their models with freshly collected data. The most commonly used optimizer for updating neural networks in these contexts is the Adam optimizer, which…

Machine Learning · Computer Science 2025-06-05 Shaowen Wang, Anan Liu, Jian Xiao, Huan Liu, Yuekui Yang, Cong Xu, Qianqian Pu, Suncong Zheng, Wei Zhang, Di Wang, Jie Jiang, Jian Li

The paper derives analytical expressions for the asymptotic average updating direction of the adaptive moment generation (ADAM) algorithm when applied to recursive identification of nonlinear systems. It is proved that the standard…

Systems and Control · Electrical Eng. & Systems 2025-10-24 Torbjörn Wigren, Ruoqi Zhang, Per Mattsson

Stochastic compositional optimization generalizes classic (non-compositional) stochastic optimization to the minimization of compositions of functions. Each composition may introduce an additional expectation. The series of expectations may…

Optimization and Control · Mathematics 2021-09-29 Tianyi Chen, Yuejiao Sun, Wotao Yin

We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has…

Machine Learning · Computer Science 2017-01-31 Diederik P. Kingma, Jimmy Ba

Compressed Deep Learning (DL) models are essential for deployment in resource-constrained environments. But their performance often lags behind their large-scale counterparts. To bridge this gap, we propose Alignment Adapter (AlAd): a…

Machine Learning · Computer Science 2026-02-17 Rohit Raj Rai, Abhishek Dhaka, Amit Awekar

Optimization algorithms with momentum, e.g., (ADAM), have been widely used for building deep learning models due to the faster convergence rates compared with stochastic gradient descent (SGD). Momentum helps accelerate SGD in the relevant…

Machine Learning · Computer Science 2020-01-24 Jiyang Bai, Yuxiang Ren, Jiawei Zhang