Related papers: A stochastic gradient method for trilevel optimiza…

Inexact bilevel stochastic gradient methods for constrained and unconstrained lower-level problems

Two-level stochastic optimization formulations have become instrumental in a number of machine learning contexts such as continual learning, neural architecture search, adversarial learning, and hyperparameter tuning. Practical stochastic…

Optimization and Control · Mathematics 2023-11-08 Tommaso Giovannelli, Griffin Dean Kent, Luis Nunes Vicente

A Gradient Method for Multilevel Optimization

Although application examples of multilevel optimization have already been discussed since the 1990s, the development of solution methods was almost limited to bilevel cases due to the difficulty of the problem. In recent years, in machine…

Optimization and Control · Mathematics 2021-10-27 Ryo Sato, Mirai Tanaka, Akiko Takeda

Multi-Level Stochastic Gradient Methods for Nested Composition Optimization

Stochastic gradient methods are scalable for solving large-scale optimization problems that involve empirical expectations of loss functions. Existing results mainly apply to optimization problems where the objectives are one- or two-level…

Optimization and Control · Mathematics 2018-01-15 Shuoguang Yang, Mengdi Wang, Ethan X. Fang

Bilevel Learning via Inexact Stochastic Gradient Descent

Bilevel optimization is a central tool in machine learning for high-dimensional hyperparameter tuning. Its applications are vast; for instance, in imaging it can be used for learning data-adaptive regularizers and optimizing forward…

Optimization and Control · Mathematics 2025-11-11 Mohammad Sadegh Salehi, Subhadip Mukherjee, Lindon Roberts, Matthias J. Ehrhardt

Towards Differentiable Multilevel Optimization: A Gradient-Based Approach

Multilevel optimization has gained renewed interest in machine learning due to its promise in applications such as hyperparameter tuning and continual learning. However, existing methods struggle with the inherent difficulty of efficiently…

Machine Learning · Computer Science 2024-10-16 Yuntian Gu, Xuzheng Chen

Convergence Properties of Stochastic Hypergradients

Bilevel optimization problems are receiving increasing attention in machine learning as they provide a natural framework for hyperparameter optimization and meta-learning. A key step to tackle these problems is the efficient computation of…

Machine Learning · Statistics 2025-05-20 Riccardo Grazzi, Massimiliano Pontil, Saverio Salzo

Efficient Gradient Approximation Method for Constrained Bilevel Optimization

Bilevel optimization has been developed for many machine learning tasks with large-scale and high-dimensional data. This paper considers a constrained bilevel optimization problem, where the lower-level optimization problem is convex with…

Machine Learning · Computer Science 2023-08-22 Siyuan Xu, Minghui Zhu

A Stochastic Gradient Method with Biased Estimation for Faster Nonconvex Optimization

A number of optimization approaches have been proposed for optimizing nonconvex objectives (e.g. deep learning models), such as batch gradient descent, stochastic gradient descent and stochastic variance reduced gradient descent. Theory…

Machine Learning · Computer Science 2019-05-15 Jia Bi, Steve R. Gunn

On the Convergence of Distributed Stochastic Bilevel Optimization Algorithms over a Network

Bilevel optimization has been applied to a wide variety of machine learning models, and numerous stochastic bilevel optimization algorithms have been developed in recent years. However, most existing algorithms restrict their focus on the…

Machine Learning · Computer Science 2023-03-28 Hongchang Gao, Bin Gu, My T. Thai

Bilevel Learning with Inexact Stochastic Gradients

Bilevel learning has gained prominence in machine learning, inverse problems, and imaging applications, including hyperparameter optimization, learning data-adaptive regularizers, and optimizing forward operators. The large-scale nature of…

Optimization and Control · Mathematics 2025-05-20 Mohammad Sadegh Salehi, Subhadip Mukherjee, Lindon Roberts, Matthias J. Ehrhardt

On the Communication Complexity of Decentralized Stochastic Bilevel Optimization

Stochastic bilevel optimization finds widespread applications in machine learning, including meta-learning, hyperparameter optimization, and neural architecture search. To extend stochastic bilevel optimization to distributed data, several…

Machine Learning · Computer Science 2026-05-26 Yihan Zhang, My T. Thai, Jie Wu, Hongchang Gao

Multilevel Stochastic Gradient Descent for Optimal Control Under Uncertainty

We present a multilevel stochastic gradient descent method for the optimal control of systems governed by partial differential equations under uncertain input data. The gradient descent method used to find the optimal control leverages a…

Optimization and Control · Mathematics 2025-06-04 Niklas Baumgarten, David Schneiderhan

Bridging Constraints and Stochasticity: A Fully First-Order Method for Stochastic Bilevel Optimization with Linear Constraints

This work provides the first finite-time convergence guarantees for linearly constrained stochastic bilevel optimization using only first-order methods, requiring solely gradient information without any Hessian computations or second-order…

Optimization and Control · Mathematics 2025-11-18 Cac Phan, Kai Wang

Nonconvex Decentralized Stochastic Bilevel Optimization under Heavy-Tailed Noise

Existing decentralized stochastic optimization methods assume the lower-level loss function is strongly convex and the stochastic gradient noise has finite variance. These strong assumptions typically are not satisfied in real-world machine…

Machine Learning · Computer Science 2026-05-26 Xinwen Zhang, Yihan Zhang, Heng Liang, Hongchang Gao

A Doubly Stochastically Perturbed Algorithm for Linearly Constrained Bilevel Optimization

In this work, we develop analysis and algorithms for a class of (stochastic) bilevel optimization problems whose lower-level (LL) problem is strongly convex and linearly constrained. Most existing approaches for solving such problems rely…

Optimization and Control · Mathematics 2025-04-08 Prashant Khanduri, Ioannis Tsaknakis, Yihua Zhang, Sijia Liu, Mingyi Hong

An Augmented Lagrangian Value Function Method for Lower-level Constrained Stochastic Bilevel Optimization

Recently, lower-level constrained bilevel optimization has attracted increasing attention. However, existing methods mostly focus on either deterministic cases or problems with linear constraints. The main challenge in stochastic cases with…

Optimization and Control · Mathematics 2025-10-13 Hantao Nie, Jiaxiang Li, Zaiwen Wen

Bilevel optimization with a multi-objective lower-level problem: Risk-neutral and risk-averse formulations

In this work, we propose different formulations and gradient-based algorithms for deterministic and stochastic bilevel problems with conflicting objectives in the lower level. Such problems have received little attention in the…

Optimization and Control · Mathematics 2023-11-08 Tommaso Giovannelli, Griffin Dean Kent, Luis Nunes Vicente

Adaptive Algorithms with Sharp Convergence Rates for Stochastic Hierarchical Optimization

Hierarchical optimization refers to problems with interdependent decision variables and objectives, such as minimax and bilevel formulations. While various algorithms have been proposed, existing methods and analyses lack adaptivity in…

Machine Learning · Computer Science 2025-10-27 Xiaochuan Gong, Jie Hao, Mingrui Liu

A Conditional Gradient-based Method for Simple Bilevel Optimization with Convex Lower-level Problem

In this paper, we study a class of bilevel optimization problems, also known as simple bilevel optimization, where we minimize a smooth objective function over the optimal solution set of another convex constrained optimization problem.…

Optimization and Control · Mathematics 2023-04-25 Ruichen Jiang, Nazanin Abolfazli, Aryan Mokhtari, Erfan Yazdandoost Hamedani

Stochastic Subspace Descent

We present two stochastic descent algorithms that apply to unconstrained optimization and are particularly efficient when the objective function is slow to evaluate and gradients are not easily obtained, as in some PDE-constrained…

Optimization and Control · Mathematics 2019-04-30 David Kozak, Stephen Becker, Alireza Doostan, Luis Tenorio