Related papers: Learning with Differentiable Perturbed Optimizers

Predictor-corrector algorithms for stochastic optimization under gradual distribution shift

Time-varying stochastic optimization problems frequently arise in machine learning practice (e.g. gradual domain shift, object tracking, strategic classification). Although most problems are solved in discrete time, the underlying process…

Machine Learning · Computer Science 2023-02-24 Subha Maity, Debarghya Mukherjee, Moulinath Banerjee, Yuekai Sun

Tasks, stability, architecture, and compute: Training more effective learned optimizers, and using them to train themselves

Much as replacing hand-designed features with learned functions has revolutionized how we solve perceptual tasks, we believe learned algorithms will transform how we train models. In this work we focus on general-purpose learned optimizers…

Machine Learning · Computer Science 2020-09-24 Luke Metz, Niru Maheswaranathan, C. Daniel Freeman, Ben Poole, Jascha Sohl-Dickstein

A Closer Look at Learned Optimization: Stability, Robustness, and Inductive Biases

Learned optimizers -- neural networks that are trained to act as optimizers -- have the potential to dramatically accelerate training of machine learning models. However, even when meta-trained across thousands of tasks at huge…

Machine Learning · Computer Science 2022-09-23 James Harrison, Luke Metz, Jascha Sohl-Dickstein

Narrowing the Focus: Learned Optimizers for Pretrained Models

In modern deep learning, the models are learned by applying gradient updates using an optimizer, which transforms the updates based on various statistics. Optimizers are often hand-designed and tuning their hyperparameters is a big part of…

Machine Learning · Computer Science 2024-10-08 Gus Kristiansen, Mark Sandler, Andrey Zhmoginov, Nolan Miller, Anirudh Goyal, Jihwan Lee, Max Vladymyrov

Investigation into the Training Dynamics of Learned Optimizers

Optimization is an integral part of modern deep learning. Recently, the concept of learned optimizers has emerged as a way to accelerate this optimization process by replacing traditional, hand-crafted algorithms with meta-learned…

Machine Learning · Computer Science 2023-12-13 Jan Sobotka, Petr Šimánek, Daniel Vašata

Learning Randomly Perturbed Structured Predictors for Direct Loss Minimization

Direct loss minimization is a popular approach for learning predictors over structured label spaces. This approach is computationally appealing as it replaces integration with optimization and allows to propagate gradients in a deep net…

Machine Learning · Statistics 2021-06-15 Hedda Cohen Indelman, Tamir Hazan

Differentiable Rendering with Perturbed Optimizers

Reasoning about 3D scenes from their 2D image projections is one of the core problems in computer vision. Solutions to this inverse and ill-posed problem typically involve a search for models that best explain observed image data. Notably,…

Computer Vision and Pattern Recognition · Computer Science 2021-10-19 Quentin Le Lidec, Ivan Laptev, Cordelia Schmid, Justin Carpentier

Melding the Data-Decisions Pipeline: Decision-Focused Learning for Combinatorial Optimization

Creating impact in real-world settings requires artificial intelligence techniques to span the full pipeline from data, to predictive models, to decisions. These components are typically approached separately: a machine learning model is…

Machine Learning · Computer Science 2018-11-22 Bryan Wilder, Bistra Dilkina, Milind Tambe

Reverse engineering learned optimizers reveals known and novel mechanisms

Learned optimizers are algorithms that can themselves be trained to solve optimization problems. In contrast to baseline optimizers (such as momentum or Adam) that use simple update rules derived from theoretical principles, learned…

Machine Learning · Computer Science 2021-12-09 Niru Maheswaranathan, David Sussillo, Luke Metz, Ruoxi Sun, Jascha Sohl-Dickstein

Learning Gradient Descent: Better Generalization and Longer Horizons

Training deep neural networks is a highly nontrivial task, involving carefully selecting appropriate training algorithms, scheduling step sizes and tuning other hyperparameters. Trying different combinations can be quite labor-intensive and…

Machine Learning · Computer Science 2017-06-13 Kaifeng Lv, Shunhua Jiang, Jian Li

Learning with Differentiable Algorithms

Classic algorithms and machine learning systems like neural networks are both abundant in everyday life. While classic computer science algorithms are suitable for precise execution of exactly defined tasks such as finding the shortest path…

Machine Learning · Computer Science 2022-09-02 Felix Petersen

Optimization Learning

This article introduces the concept of optimization learning, a methodology to design optimization proxies that learn the input/output mapping of parametric optimization problems. These optimization proxies are trustworthy by design: they…

Optimization and Control · Mathematics 2025-01-08 Pascal Van Hentenryck

Understanding and correcting pathologies in the training of learned optimizers

Deep learning has shown that learned functions can dramatically outperform hand-designed functions on perceptual tasks. Analogously, this suggests that learned optimizers may similarly outperform current hand-designed optimizers, especially…

Neural and Evolutionary Computing · Computer Science 2019-06-11 Luke Metz, Niru Maheswaranathan, Jeremy Nixon, C. Daniel Freeman, Jascha Sohl-Dickstein

Data-driven Algorithm Selection and Parameter Tuning: Two Case studies in Optimization and Signal Processing

Machine learning algorithms typically rely on optimization subroutines and are well-known to provide very effective outcomes for many types of problems. Here, we flip the reliance and ask the reverse question: can machine learning…

Machine Learning · Computer Science 2019-07-30 Jesus A. De Loera, Jamie Haddock, Anna Ma, Deanna Needell

Metric Learning to Accelerate Convergence of Operator Splitting Methods for Differentiable Parametric Programming

Recent work has shown a variety of ways in which machine learning can be used to accelerate the solution of constrained optimization problems. Increasing demand for real-time decision-making capabilities in applications such as artificial…

Machine Learning · Computer Science 2024-04-02 Ethan King, James Kotary, Ferdinando Fioretto, Jan Drgona

Optimizers Qualitatively Alter Solutions And We Should Leverage This

Due to the nonlinear nature of Deep Neural Networks (DNNs), one can not guarantee convergence to a unique global minimum of the loss when using optimizers relying only on local information, such as SGD. Indeed, this was a primary source of…

Machine Learning · Computer Science 2025-07-17 Razvan Pascanu, Clare Lyle, Ionut-Vlad Modoranu, Naima Elosegui Borras, Dan Alistarh, Petar Velickovic, Sarath Chandar, Soham De, James Martens

Stochastic optimization with decision-dependent distributions

Stochastic optimization problems often involve data distributions that change in reaction to the decision variables. This is the case for example when members of the population respond to a deployed classifier by manipulating their features…

Optimization and Control · Mathematics 2020-12-15 Dmitriy Drusvyatskiy, Lin Xiao

Adaptive Optimization Algorithms for Machine Learning

Machine learning assumes a pivotal role in our data-driven world. The increasing scale of models and datasets necessitates quick and reliable algorithms for model training. This dissertation investigates adaptivity in machine learning…

Machine Learning · Computer Science 2023-11-20 Slavomír Hanzely

Accelerating Optimization via Differentiable Stopping Time

Optimization is an important module of modern machine learning applications. Tremendous efforts have been made to accelerate optimization algorithms. A common formulation is achieving a lower loss at a given time. This enables a…

Machine Learning · Computer Science 2025-05-29 Zhonglin Xie, Yiman Fong, Haoran Yuan, Zaiwen Wen

Towards Robust, Locally Linear Deep Networks

Deep networks realize complex mappings that are often understood by their locally linear behavior at or around points of interest. For example, we use the derivative of the mapping with respect to its inputs for sensitivity analysis, or to…

Machine Learning · Computer Science 2019-07-09 Guang-He Lee, David Alvarez-Melis, Tommi S. Jaakkola