Related papers: Memory-Efficient Gradient Unrolling for Large-Scal…

UFO-BLO: Unbiased First-Order Bilevel Optimization

Bilevel optimization (BLO) is a popular approach with many applications including hyperparameter optimization, neural architecture search, adversarial robustness and model-agnostic meta-learning. However, the approach suffers from time and…

Machine Learning · Computer Science 2021-06-08 Valerii Likhosherstov, Xingyou Song, Krzysztof Choromanski, Jared Davis, Adrian Weller

Efficient Bilevel Optimization with KFAC-Based Hypergradients

Bilevel optimization (BO) is widely applicable to many machine learning problems. Scaling BO, however, requires repeatedly computing hypergradients, which involves solving inverse Hessian-vector products (IHVPs). In practice, these…

Machine Learning · Computer Science 2026-04-01 Disen Liao, Felix Dangel, Yaoliang Yu

BOME! Bilevel Optimization Made Easy: A Simple First-Order Approach

Bilevel optimization (BO) is useful for solving a variety of important machine learning problems including but not limited to hyperparameter optimization, meta-learning, continual learning, and reinforcement learning. Conventional BO…

Machine Learning · Computer Science 2022-09-20 Mao Ye, Bo Liu, Stephen Wright, Peter Stone, Qiang Liu

Provably Faster Algorithms for Bilevel Optimization

Bilevel optimization has been widely applied in many important machine learning applications such as hyperparameter optimization and meta-learning. Recently, several momentum-based algorithms have been proposed to solve bilevel optimization…

Machine Learning · Computer Science 2021-12-17 Junjie Yang, Kaiyi Ji, Yingbin Liang

Deep Learning for Two-Stage Robust Integer Optimization

Robust optimization is an established framework for modeling optimization problems with uncertain parameters. While static robust optimization is often criticized for being too conservative, two-stage (or adjustable) robust optimization…

Optimization and Control · Mathematics 2024-11-05 Justin Dumouchelle, Esther Julien, Jannis Kurtz, Elias B. Khalil

A General Descent Aggregation Framework for Gradient-based Bi-level Optimization

In recent years, a variety of gradient-based methods have been developed to solve Bi-Level Optimization (BLO) problems in machine learning and computer vision areas. However, the theoretical correctness and practical effectiveness of these…

Machine Learning · Computer Science 2022-01-04 Risheng Liu, Pan Mu, Xiaoming Yuan, Shangzhi Zeng, Jin Zhang

Bilevel learning

Bilevel learning refers to machine learning problems that can be formulated as bilevel optimization models, where decisions are organized in a hierarchical structure. This paradigm has recently gained considerable attention in machine…

Optimization and Control · Mathematics 2026-05-05 Riccardo Grazzi, Massimiliano Pontil, Saverio Salzo, Alain Zemkoho

Efficient Curvature-Aware Hypergradient Approximation for Bilevel Optimization

Bilevel optimization is a powerful tool for many machine learning problems, such as hyperparameter optimization and meta-learning. Estimating hypergradients (also known as implicit gradients) is crucial for developing gradient-based methods…

Optimization and Control · Mathematics 2025-05-06 Youran Dong, Junfeng Yang, Wei Yao, Jin Zhang

Investigating Bi-Level Optimization for Learning and Vision from a Unified Perspective: A Survey and Beyond

Bi-Level Optimization (BLO) originated in economic game theory and was then introduced into the optimization community. BLO is able to handle problems with a hierarchical structure, involving two levels of optimization tasks,…

Machine Learning · Computer Science 2021-09-29 Risheng Liu, Jiaxin Gao, Jin Zhang, Deyu Meng, Zhouchen Lin

Towards Extremely Fast Bilevel Optimization with Self-governed Convergence Guarantees

Gradient methods have become mainstream techniques for Bi-Level Optimization (BLO) in learning and vision fields. The validity of existing works heavily relies on solving a series of approximation subproblems with extraordinarily high…

Optimization and Control · Mathematics 2022-05-23 Risheng Liu, Xuan Liu, Wei Yao, Shangzhi Zeng, Jin Zhang

Bilevel Learning via Inexact Stochastic Gradient Descent

Bilevel optimization is a central tool in machine learning for high-dimensional hyperparameter tuning. Its applications are vast; for instance, in imaging it can be used for learning data-adaptive regularizers and optimizing forward…

Optimization and Control · Mathematics 2025-11-11 Mohammad Sadegh Salehi, Subhadip Mukherjee, Lindon Roberts, Matthias J. Ehrhardt

Scalable Meta-Learning via Mixed-Mode Differentiation

Gradient-based bilevel optimisation is a powerful technique with applications in hyperparameter optimisation, task adaptation, algorithm discovery, meta-learning more broadly, and beyond. It often requires differentiating through the…

Machine Learning · Computer Science 2025-06-11 Iurii Kemaev, Dan A Calian, Luisa M Zintgraf, Gregory Farquhar, Hado van Hasselt

Debiasing a First-order Heuristic for Approximate Bi-level Optimization

Approximate bi-level optimization (ABLO) consists of (outer-level) optimization problems, involving numerical (inner-level) optimization loops. While ABLO has many applications across deep learning, it suffers from time and memory…

Machine Learning · Computer Science 2021-06-09 Valerii Likhosherstov, Xingyou Song, Krzysztof Choromanski, Jared Davis, Adrian Weller

Effective Bilevel Optimization via Minimax Reformulation

Bilevel optimization has found successful applications in various machine learning problems, including hyper-parameter optimization, data cleaning, and meta-learning. However, its huge computational cost presents a significant challenge for…

Machine Learning · Computer Science 2024-11-05 Xiaoyu Wang, Rui Pan, Renjie Pi, Jipeng Zhang

A Stochastic Approach to Bi-Level Optimization for Hyperparameter Optimization and Meta Learning

We tackle the general differentiable meta learning problem that is ubiquitous in modern deep learning, including hyperparameter optimization, loss function learning, few-shot learning, invariance learning and more. These problems are often…

Machine Learning · Computer Science 2024-10-15 Minyoung Kim, Timothy M. Hospedales

Implicit Bilevel Optimization: Differentiating through Bilevel Optimization Programming

Bilevel Optimization Programming is used to model complex and conflicting interactions between agents, for example in Robust AI or Privacy-preserving AI. Integrating bilevel mathematical programming within deep learning is thus an essential…

Machine Learning · Computer Science 2023-03-01 Francesco Alesiani

FG-OrIU: Towards Better Forgetting via Feature-Gradient Orthogonality for Incremental Unlearning

Incremental unlearning (IU) is critical for pre-trained models to comply with sequential data deletion requests, yet existing methods primarily suppress parameters or confuse knowledge without explicit constraints on both feature and…

Machine Learning · Computer Science 2026-01-21 Qian Feng, JiaHang Tu, Mintong Kang, Hanbin Zhao, Chao Zhang, Hui Qian

Bilevel Optimization under Unbounded Smoothness: A New Algorithm and Convergence Analysis

Bilevel optimization is an important formulation for many machine learning problems. Current bilevel optimization algorithms assume that the gradient of the upper-level function is Lipschitz. However, recent studies reveal that certain…

Machine Learning · Computer Science 2024-01-19 Jie Hao, Xiaochuan Gong, Mingrui Liu

Non-Stationary Functional Bilevel Optimization

Functional bilevel optimization (FBO) provides a powerful framework for hierarchical learning in function spaces, yet current methods are limited to static offline settings and perform suboptimally in online, non-stationary scenarios. We…

Machine Learning · Statistics 2026-01-23 Jason Bohne, Ieva Petrulionyte, Michael Arbel, Julien Mairal, Paweł Polak

Gradient-based algorithms for multi-objective bi-level optimization

Multi-Objective Bi-Level Optimization (MOBLO) addresses nested multi-objective optimization problems common in a range of applications. However, its multi-objective and hierarchical bilevel nature makes it notably complex. Gradient-based…

Optimization and Control · Mathematics 2024-06-11 Xinmin Yang, Wei Yao, Haian Yin, Shangzhi Zeng, Jin Zhang