Related papers: Training Aware Sigmoidal Optimizer

200 papers

With increasing data and model complexities, the time required to train neural networks has become prohibitively large. To address the exponential rise in training time, users are turning to data parallel neural networks (DPNN) to utilize…

Machine Learning · Computer Science 2022-02-09 Daniel Coquelin, Charlotte Debus, Markus Götz, Fabrice von der Lehr, James Kahn, Martin Siggel, Achim Streit

Training neural networks on image datasets generally requires extensive experimentation to find the optimal learning rate regime. Especially for the cases of adversarial training or for training a newly synthesized model, one would not know…

Machine Learning · Computer Science 2019-10-28 Koyel Mukherjee, Alind Khare, Ashish Verma

Sharpness aware minimization (SAM) optimizer has been extensively explored as it can generalize better for training deep neural networks via introducing extra perturbation steps to flatten the landscape of deep learning models. Integrating…

Machine Learning · Computer Science 2023-03-02 Hao Sun, Li Shen, Qihuang Zhong, Liang Ding, Shixiang Chen, Jingwei Sun, Jing Li, Guangzhong Sun, Dacheng Tao

The concept of learning to optimize involves utilizing a trainable optimization strategy rather than relying on manually defined full gradient estimations such as ADAM. We present a framework that jointly trains the full gradient estimator…

Machine Learning · Computer Science 2026-01-30 Ruiqi Wang, Diego Klabjan

Fast gradient-based optimization algorithms have become increasingly essential for the computationally efficient training of machine learning models. One technique is to multiply the gradient by a preconditioner matrix to produce a step,…

Machine Learning · Computer Science 2023-09-12 Isaac Liao, Rumen R. Dangovski, Jakob N. Foerster, Marin Soljačić

Large-scale multimodal pre-trained models like CLIP rely heavily on high-quality training data, yet raw web-crawled datasets are often noisy, misaligned, and redundant, leading to inefficient training and suboptimal generalization. Existing…

Machine Learning · Computer Science 2026-02-06 Guanjie Cheng, Boyi Li, Lingyu Sun, Mengying Zhu, Yangyang Wu, Xinkui Zhao, Shuiguang Deng

The learning rate schedule is one of the most impactful aspects of neural network optimization, yet most schedules either follow simple parametric functions or react only to short-term training signals. None of them are supported by a…

Machine Learning · Computer Science 2025-09-30 Matt L. Sampson, Peter Melchior

We present Amos, a stochastic gradient-based optimizer designed for training deep neural networks. It can be viewed as an Adam optimizer with theoretically supported, adaptive learning-rate decay and weight decay. A key insight behind Amos…

Machine Learning · Computer Science 2022-11-22 Ran Tian, Ankur P. Parikh

The delta-bar-delta algorithm is recognized as a learning rate adaptation technique that enhances the convergence speed of the training process in optimization by dynamically scheduling the learning rate based on the difference between the…

Machine Learning · Computer Science 2023-10-18 Zhao Song, Chiwun Yang

This paper introduces Stress-Aware Learning, a resilient neural training paradigm in which deep neural networks dynamically adjust their optimization behavior - whether under stable training regimes or in settings with uncertain dynamics -…

Machine Learning · Computer Science 2025-08-04 Ashkan Shakarami, Yousef Yeganeh, Azade Farshad, Lorenzo Nicole, Stefano Ghidoni, Nassir Navab

Up to now, the training processes of typical Generative Adversarial Networks (GANs) are still particularly sensitive to data properties and hyperparameters, which may lead to severe oscillations, difficulties in convergence, or even…

Machine Learning · Computer Science 2025-04-22 Lin Wang, Xiancheng Wang, Rui Wang, Zhibo Zhang, Minghang Zhao

Recent focus on robustness to adversarial attacks for deep neural networks produced a large variety of algorithms for training robust models. Most of the effective algorithms involve solving the min-max optimization problem for training…

Machine Learning · Computer Science 2021-03-03 Yasaman Esfandiari, Aditya Balu, Keivan Ebrahimi, Umesh Vaidya, Nicola Elia, Soumik Sarkar

AdamZ is an advanced variant of the Adam optimiser, developed to enhance convergence efficiency in neural network training. This optimiser dynamically adjusts the learning rate by incorporating mechanisms to address overshooting and…

Machine Learning · Computer Science 2024-11-26 Ilia Zaznov, Atta Badii, Alfonso Dufour, Julian Kunkel

The increasing complexity of deep learning architectures is resulting in training time requiring weeks or even months. This slow training is due in part to vanishing gradients, in which the gradients used by back-propagation are extremely…

Computer Vision and Pattern Recognition · Computer Science 2015-10-16 Bharat Singh, Soham De, Yangmuzi Zhang, Thomas Goldstein, Gavin Taylor

First-order optimization methods, such as SGD and Adam, are widely used for training large-scale deep neural networks due to their computational efficiency and robust performance. However, relying solely on gradient information, these…

Machine Learning · Computer Science 2025-07-29 Yue Hu, Zanxia Cao, Yingchao Liu

Neural implicit mapping has emerged as a powerful paradigm for robotic navigation and scene understanding. However, real-world robotic deployment requires continual adaptation to changing environments under strict memory and computation…

Robotics · Computer Science 2026-05-29 Xunlan Zhou, Hongrui Zhao, Negar Mehr

Data-driven machine learning approaches have recently been proposed to facilitate wireless network optimization by learning latent knowledge from historical optimization instances. However, existing methods do not well handle the topology…

Networking and Internet Architecture · Computer Science 2021-01-06 Shuai Zhang, Bo Yin, Yu Cheng

While deep learning models have replaced hand-designed features across many domains, these models are still trained with hand-designed optimizers. In this work, we leverage the same scaling approach behind the success of deep learning to…

With the extensive applications of machine learning models, automatic hyperparameter optimization (HPO) has become increasingly important. Motivated by the tuning behaviors of human experts, it is intuitive to leverage auxiliary knowledge…

Machine Learning · Computer Science 2022-06-07 Yang Li, Yu Shen, Huaijun Jiang, Wentao Zhang, Zhi Yang, Ce Zhang, Bin Cui

Learning how to learn efficiently is a fundamental challenge for biological agents and a growing concern for artificial ones. To learn effectively, an agent must regulate its learning speed, balancing the benefits of rapid improvement…

Machine Learning · Computer Science 2026-01-13 Valentina Njaradi, Rodrigo Carrasco-Davis, Peter E. Latham, Andrew Saxe