Related papers: Learning to Optimize Neural Nets

Learning Gradient Descent: Better Generalization and Longer Horizons

Training deep neural networks is a highly nontrivial task, involving carefully selecting appropriate training algorithms, scheduling step sizes and tuning other hyperparameters. Trying different combinations can be quite labor-intensive and…

Machine Learning · Computer Science 2017-06-13 Kaifeng Lv, Shunhua Jiang, Jian Li

Learned Optimizers that Scale and Generalize

Learning to learn has emerged as an important direction for achieving artificial intelligence. Two of the primary barriers to its adoption are an inability to scale to larger problems and a limited ability to generalize to new tasks. We…

Machine Learning · Computer Science 2017-09-11 Olga Wichrowska, Niru Maheswaranathan, Matthew W. Hoffman, Sergio Gomez Colmenarejo, Misha Denil, Nando de Freitas, Jascha Sohl-Dickstein

Neural Optimizer Search with Reinforcement Learning

We present an approach to automate the process of discovering optimization methods, with a focus on deep learning architectures. We train a Recurrent Neural Network controller to generate a string in a domain specific language that…

Artificial Intelligence · Computer Science 2017-09-25 Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le

A Comparison of Optimization Algorithms for Deep Learning

In recent years, we have witnessed the rise of deep learning. Deep neural networks have proved their success in many areas. However, the optimization of these networks has become more difficult as neural networks go deeper and datasets…

Machine Learning · Computer Science 2020-08-05 Derya Soydaner

Can Learned Optimization Make Reinforcement Learning Less Difficult?

While reinforcement learning (RL) holds great potential for decision making in the real world, it suffers from a number of unique difficulties which often need specific consideration. In particular: it is highly non-stationary; suffers from…

Machine Learning · Computer Science 2025-04-16 Alexander David Goldie, Chris Lu, Matthew Thomas Jackson, Shimon Whiteson, Jakob Nicolaus Foerster

Investigation into the Training Dynamics of Learned Optimizers

Optimization is an integral part of modern deep learning. Recently, the concept of learned optimizers has emerged as a way to accelerate this optimization process by replacing traditional, hand-crafted algorithms with meta-learned…

Machine Learning · Computer Science 2023-12-13 Jan Sobotka, Petr Šimánek, Daniel Vašata

Qualitatively characterizing neural network optimization problems

Training neural networks involves solving large-scale non-convex optimization problems. This task has long been believed to be extremely difficult, with fear of local minima and other obstacles motivating a variety of schemes to improve…

Neural and Evolutionary Computing · Computer Science 2015-05-25 Ian J. Goodfellow, Oriol Vinyals, Andrew M. Saxe

Learning to Optimize in Swarms

Learning to optimize has emerged as a powerful framework for various optimization and machine learning tasks. Current such "meta-optimizers" often learn in the space of continuous optimization algorithms that are point-based and…

Machine Learning · Computer Science 2019-11-19 Yue Cao, Tianlong Chen, Zhangyang Wang, Yang Shen

Dynamic Optimization of Neural Network Structures Using Probabilistic Modeling

Deep neural networks (DNNs) are powerful machine learning models and have succeeded in various artificial intelligence tasks. Although various architectures and modules for the DNNs have been proposed, selecting and designing the…

Neural and Evolutionary Computing · Computer Science 2018-01-24 Shinichi Shirakawa, Yasushi Iwata, Youhei Akimoto

Learning to Defend by Learning to Attack

Adversarial training provides a principled approach for training robust neural networks. From an optimization perspective, adversarial training is essentially solving a bilevel optimization problem. The leader problem is trying to learn a…

Machine Learning · Computer Science 2021-05-04 Haoming Jiang, Zhehui Chen, Yuyang Shi, Bo Dai, Tuo Zhao

A Closer Look at Learned Optimization: Stability, Robustness, and Inductive Biases

Learned optimizers -- neural networks that are trained to act as optimizers -- have the potential to dramatically accelerate training of machine learning models. However, even when meta-trained across thousands of tasks at huge…

Machine Learning · Computer Science 2022-09-23 James Harrison, Luke Metz, Jascha Sohl-Dickstein

Narrowing the Focus: Learned Optimizers for Pretrained Models

In modern deep learning, the models are learned by applying gradient updates using an optimizer, which transforms the updates based on various statistics. Optimizers are often hand-designed and tuning their hyperparameters is a big part of…

Machine Learning · Computer Science 2024-10-08 Gus Kristiansen, Mark Sandler, Andrey Zhmoginov, Nolan Miller, Anirudh Goyal, Jihwan Lee, Max Vladymyrov

Optimization Methods for Supervised Machine Learning: From Linear Models to Deep Learning

The goal of this tutorial is to introduce key models, algorithms, and open questions related to the use of optimization methods for solving problems arising in machine learning. It is written with an INFORMS audience in mind, specifically…

Machine Learning · Statistics 2017-07-03 Frank E. Curtis, Katya Scheinberg

Learning to Optimize for Reinforcement Learning

In recent years, by leveraging more data, computation, and diverse tasks, learned optimizers have achieved remarkable success in supervised learning, outperforming classical hand-designed optimizers. Reinforcement learning (RL) is…

Machine Learning · Computer Science 2024-06-05 Qingfeng Lan, A. Rupam Mahmood, Shuicheng Yan, Zhongwen Xu

Training Neural Networks at Any Scale

This article reviews modern optimization methods for training neural networks with an emphasis on efficiency and scale. We present state-of-the-art optimization algorithms under a unified algorithmic template that highlights the importance…

Machine Learning · Computer Science 2025-11-17 Thomas Pethick, Kimon Antonakopoulos, Antonio Silveti-Falls, Leena Chennuru Vankadara, Volkan Cevher

Learning Hard Optimization Problems: A Data Generation Perspective

Optimization problems are ubiquitous in our societies and are present in almost every segment of the economy. Most of these optimization problems are NP-hard and computationally demanding, often requiring approximate solutions for…

Optimization and Control · Mathematics 2021-06-23 James Kotary, Ferdinando Fioretto, Pascal Van Hentenryck

Robust Optimization Framework for Training Shallow Neural Networks Using Reachability Method

In this paper, a robust optimization framework is developed to train shallow neural networks based on reachability analysis of neural networks. To characterize noises of input data, the input training data is disturbed in the description of…

Machine Learning · Computer Science 2021-07-28 Yejiang Yang, Weiming Xiang

Greedy Learning to Optimize with Convergence Guarantees

Learning to optimize is an approach that leverages training data to accelerate the solution of optimization problems. Many approaches use unrolling to parametrize the update step and learn optimal parameters. Although L2O has shown…

Optimization and Control · Mathematics 2025-07-15 Patrick Fahy, Mohammad Golbabaee, Matthias J. Ehrhardt

Understanding and correcting pathologies in the training of learned optimizers

Deep learning has shown that learned functions can dramatically outperform hand-designed functions on perceptual tasks. Analogously, this suggests that learned optimizers may similarly outperform current hand-designed optimizers, especially…

Neural and Evolutionary Computing · Computer Science 2019-06-11 Luke Metz, Niru Maheswaranathan, Jeremy Nixon, C. Daniel Freeman, Jascha Sohl-Dickstein

A Bridge Between Hyperparameter Optimization and Learning-to-learn

We consider a class of nested optimization problems involving inner and outer objectives. We observe that by taking into explicit account the optimization dynamics for the inner objective it is possible to derive a general framework that…

Machine Learning · Statistics 2019-08-22 Luca Franceschi, Michele Donini, Paolo Frasconi, Massimiliano Pontil