Related papers: Scalable Nested Optimization for Deep Learning

Towards Differentiable Multilevel Optimization: A Gradient-Based Approach

Multilevel optimization has gained renewed interest in machine learning due to its promise in applications such as hyperparameter tuning and continual learning. However, existing methods struggle with the inherent difficulty of efficiently…

Machine Learning · Computer Science 2024-10-16 Yuntian Gu , Xuzheng Chen

A Review of Bilevel Optimization: Methods, Emerging Applications, and Recent Advancements

This paper presents a comprehensive review of techniques proposed in the literature for solving bilevel optimization problems encountered in various real-life applications. Bilevel optimization is an appropriate choice for hierarchical…

Optimization and Control · Mathematics 2025-11-06 Dhaval Pujara , Ankur Sinha

Bilevel learning

Bilevel learning refers to machine learning problems that can be formulated as bilevel optimization models, where decisions are organized in a hierarchical structure. This paradigm has recently gained considerable attention in machine…

Optimization and Control · Mathematics 2026-05-05 Riccardo Grazzi , Massimiliano Pontil , Saverio Salzo , Alain Zemkoho

Gradient-based Bi-level Optimization for Deep Learning: A Survey

Bi-level optimization, especially the gradient-based category, has been widely used in the deep learning community including hyperparameter optimization and meta-knowledge extraction. Bi-level optimization embeds one problem within another…

Machine Learning · Computer Science 2023-07-11 Can Chen , Xi Chen , Chen Ma , Zixuan Liu , Xue Liu

Implicit Bilevel Optimization: Differentiating through Bilevel Optimization Programming

Bilevel Optimization Programming is used to model complex and conflicting interactions between agents, for example in Robust AI or Privacy-preserving AI. Integrating bilevel mathematical programming within deep learning is thus an essential…

Machine Learning · Computer Science 2023-03-01 Francesco Alesiani

Bilevel Programming for Hyperparameter Optimization and Meta-Learning

We introduce a framework based on bilevel programming that unifies gradient-based hyperparameter optimization and meta-learning. We show that an approximate version of the bilevel problem can be solved by taking into explicit account the…

Machine Learning · Statistics 2018-07-04 Luca Franceschi , Paolo Frasconi , Saverio Salzo , Riccardo Grazzi , Massimiliano Pontil

On the Convergence of Distributed Stochastic Bilevel Optimization Algorithms over a Network

Bilevel optimization has been applied to a wide variety of machine learning models, and numerous stochastic bilevel optimization algorithms have been developed in recent years. However, most existing algorithms restrict their focus on the…

Machine Learning · Computer Science 2023-03-28 Hongchang Gao , Bin Gu , My T. Thai

A Gradient Method for Multilevel Optimization

Although application examples of multilevel optimization have already been discussed since the 1990s, the development of solution methods was almost limited to bilevel cases due to the difficulty of the problem. In recent years, in machine…

Optimization and Control · Mathematics 2021-10-27 Ryo Sato , Mirai Tanaka , Akiko Takeda

Bilevel Learning via Inexact Stochastic Gradient Descent

Bilevel optimization is a central tool in machine learning for high-dimensional hyperparameter tuning. Its applications are vast; for instance, in imaging it can be used for learning data-adaptive regularizers and optimizing forward…

Optimization and Control · Mathematics 2025-11-11 Mohammad Sadegh Salehi , Subhadip Mukherjee , Lindon Roberts , Matthias J. Ehrhardt

Beyond backpropagation: bilevel optimization through implicit differentiation and equilibrium propagation

This paper reviews gradient-based techniques to solve bilevel optimization problems. Bilevel optimization is a general way to frame the learning of systems that are implicitly defined through a quantity that they minimize. This…

Machine Learning · Computer Science 2023-05-26 Nicolas Zucchet , João Sacramento

A multi-stage deep learning based algorithm for multiscale model reduction

In this work, we propose a multi-stage training strategy for the development of deep learning algorithms applied to problems with multiscale features. Each stage of the proposed strategy shares an (almost) identical network structure and…

Numerical Analysis · Mathematics 2020-09-25 Eric Chung , Wing Tat Leung , Sai-Mang Pun , Zecheng Zhang

Effective Bilevel Optimization via Minimax Reformulation

Bilevel optimization has found successful applications in various machine learning problems, including hyper-parameter optimization, data cleaning, and meta-learning. However, its huge computational cost presents a significant challenge for…

Machine Learning · Computer Science 2024-11-05 Xiaoyu Wang , Rui Pan , Renjie Pi , Jipeng Zhang

Multi-Level Stochastic Gradient Methods for Nested Composition Optimization

Stochastic gradient methods are scalable for solving large-scale optimization problems that involve empirical expectations of loss functions. Existing results mainly apply to optimization problems where the objectives are one- or two-level…

Optimization and Control · Mathematics 2018-01-15 Shuoguang Yang , Mengdi Wang , Ethan X. Fang

A Bridge Between Hyperparameter Optimization and Learning-to-learn

We consider a class of nested optimization problems involving inner and outer objectives. We observe that by taking into explicit account the optimization dynamics for the inner objective it is possible to derive a general framework that…

Machine Learning · Statistics 2019-08-22 Luca Franceschi , Michele Donini , Paolo Frasconi , Massimiliano Pontil

On Stability and Generalization of Bilevel Optimization Problem

(Stochastic) bilevel optimization is a frequently encountered problem in machine learning with a wide range of applications such as meta-learning, hyper-parameter optimization, and reinforcement learning. Most of the existing studies on…

Machine Learning · Computer Science 2023-03-16 Meng Ding , Mingxi Lei , Yunwen Lei , Di Wang , Jinhui Xu

Efficient Curvature-Aware Hypergradient Approximation for Bilevel Optimization

Bilevel optimization is a powerful tool for many machine learning problems, such as hyperparameter optimization and meta-learning. Estimating hypergradients (also known as implicit gradients) is crucial for developing gradient-based methods…

Optimization and Control · Mathematics 2025-05-06 Youran Dong , Junfeng Yang , Wei Yao , Jin Zhang

Adaptive Training Distributions with Scalable Online Bilevel Optimization

Large neural networks pretrained on web-scale corpora are central to modern machine learning. In this paradigm, the distribution of the large, heterogeneous pretraining data rarely matches that of the application domain. This work considers…

Machine Learning · Computer Science 2023-11-21 David Grangier , Pierre Ablin , Awni Hannun

Bilevel optimization for learning hyperparameters: Application to solving PDEs and inverse problems with Gaussian processes

Methods for solving scientific computing and inference problems, such as kernel- and neural network-based approaches for partial differential equations (PDEs), inverse problems, and supervised learning tasks, depend crucially on the choice…

Machine Learning · Statistics 2025-10-08 Nicholas H. Nelsen , Houman Owhadi , Andrew M. Stuart , Xianjin Yang , Zongren Zou

Empirically Accelerating Scaled Gradient Projection Using Deep Neural Network For Inverse Problems In Image Processing

Recently, deep neural networks (DNNs) have shown advantages in accelerating optimization algorithms. One approach is to unfold a finite number of iterations of conventional optimization algorithms and to learn parameters in the algorithms.…

Machine Learning · Computer Science 2021-04-23 Byung Hyun Lee , Se Young Chun

Bilevel Optimization for Machine Learning: Algorithm Design and Convergence Analysis

Bilevel optimization has become a powerful framework in various machine learning applications including meta-learning, hyperparameter optimization, and network architecture search. There are generally two classes of bilevel optimization…

Machine Learning · Computer Science 2021-08-03 Kaiyi Ji