Related papers: A Fully First-Order Layer for Differentiable Optim…

Bridging Constraints and Stochasticity: A Fully First-Order Method for Stochastic Bilevel Optimization with Linear Constraints

This work provides the first finite-time convergence guarantees for linearly constrained stochastic bilevel optimization using only first-order methods, requiring solely gradient information without any Hessian computations or second-order…

Optimization and Control · Mathematics 2025-11-18 Cac Phan, Kai Wang

First-Order Methods for Linearly Constrained Bilevel Optimization

Algorithms for bilevel optimization often encounter Hessian computations, which are prohibitive in high dimensions. While recent works offer first-order methods for unconstrained bilevel problems, the constrained setting remains relatively…

Optimization and Control · Mathematics 2025-04-22 Guy Kornowski, Swati Padmanabhan, Kai Wang, Zhe Zhang, Suvrit Sra

BOME! Bilevel Optimization Made Easy: A Simple First-Order Approach

Bilevel optimization (BO) is useful for solving a variety of important machine learning problems including but not limited to hyperparameter optimization, meta-learning, continual learning, and reinforcement learning. Conventional BO…

Machine Learning · Computer Science 2022-09-20 Mao Ye, Bo Liu, Stephen Wright, Peter Stone, Qiang Liu

Fully First-Order Algorithms for Online Bilevel Optimization

In this work, we study nonconvex-strongly convex online bilevel optimization (OBO) using only first-order oracle. Existing OBO algorithms are mainly based on hypergradient descent, which requires access to a Hessian-vector product (HVP)…

Machine Learning · Computer Science 2026-05-12 Tingkai Jia, Cheng Chen

A Single-Loop First-Order Algorithm for Linearly Constrained Bilevel Optimization

We study bilevel optimization problems where the lower-level problems are strongly convex and have coupled linear constraints. To overcome the potential non-smoothness of the hyper-objective and the computational challenges associated with…

Optimization and Control · Mathematics 2026-02-06 Wei Shen, Jiawei Zhang, Minhui Huang, Cong Shen

A Simple First-Order Algorithm for Full-Rank Equality Constrained Optimization

A very simple first-order algorithm is proposed for solving nonlinear optimization problems with deterministic nonlinear equality constraints. This algorithm adaptively selects steps in the plane tangent to the constraints or steps that…

Optimization and Control · Mathematics 2026-03-11 Serge Gratton, Philippe L. Toint

First-Order Preconditioning via Hypergradient Descent

Standard gradient descent methods are susceptible to a range of issues that can impede training, such as high correlations and different scaling in parameter space. These difficulties can be addressed by second-order approaches that apply a…

Machine Learning · Computer Science 2020-04-29 Ted Moskovitz, Rui Wang, Janice Lan, Sanyam Kapoor, Thomas Miconi, Jason Yosinski, Aditya Rawal

A Fully First-Order Method for Stochastic Bilevel Optimization

We consider stochastic unconstrained bilevel optimization problems when only the first-order gradient oracles are available. While numerous optimization methods have been proposed for tackling bilevel problems, existing methods either tend…

Optimization and Control · Mathematics 2023-01-27 Jeongyeol Kwon, Dohyun Kwon, Stephen Wright, Robert Nowak

Differentially Private Bilevel Optimization

We present differentially private (DP) algorithms for bilevel optimization, a problem class that received significant attention lately in various machine learning applications. These are the first algorithms for such problems under standard…

Machine Learning · Computer Science 2026-01-15 Guy Kornowski

Second-Order Sensitivity Analysis for Bilevel Optimization

In this work we derive a second-order approach to bilevel optimization, a type of mathematical programming in which the solution to a parameterized optimization problem (the "lower" problem) is itself to be optimized (in the "upper"…

Optimization and Control · Mathematics 2022-05-06 Robert Dyro, Edward Schmerling, Nikos Arechiga, Marco Pavone

First order online optimisation using forward gradients in over-parameterised systems

The success of deep learning over the past decade mainly relies on gradient-based optimisation and backpropagation. This paper focuses on analysing the performance of first-order gradient-based optimisation algorithms, gradient descent and…

Optimization and Control · Mathematics 2022-12-08 Behnam Mafakheri, Iman Shames, Jonathan H. Manton

Near-Optimal Nonconvex-Strongly-Convex Bilevel Optimization with Fully First-Order Oracles

In this work, we consider bilevel optimization when the lower-level problem is strongly convex. Recent works show that with a Hessian-vector product (HVP) oracle, one can provably find an $\epsilon$-stationary point within…

Optimization and Control · Mathematics 2026-05-26 Lesi Chen, Yaohua Ma, Jingzhao Zhang

On the Convergence Theory for Hessian-Free Bilevel Algorithms

Bilevel optimization has arisen as a powerful tool in modern machine learning. However, due to the nested structure of bilevel optimization, even gradient-based methods require second-order derivative approximations via Jacobian- or/and…

Machine Learning · Computer Science 2022-06-07 Daouda Sow, Kaiyi Ji, Yingbin Liang

Towards Differentiable Multilevel Optimization: A Gradient-Based Approach

Multilevel optimization has gained renewed interest in machine learning due to its promise in applications such as hyperparameter tuning and continual learning. However, existing methods struggle with the inherent difficulty of efficiently…

Machine Learning · Computer Science 2024-10-16 Yuntian Gu, Xuzheng Chen

Level Constrained First Order Methods for Function Constrained Optimization

We present a new feasible proximal gradient method for constrained optimization where both the objective and constraint functions are given by the summation of a smooth, possibly nonconvex function and a convex simple function. The…

Optimization and Control · Mathematics 2024-02-01 Digvijay Boob, Qi Deng, Guanghui Lan

Gathering and Exploiting Higher-Order Information when Training Large Structured Models

When training large models, such as neural networks, the full derivatives of order 2 and beyond are usually inaccessible, due to their computational cost. Therefore, among the second-order optimization methods, it is common to bypass the…

Machine Learning · Computer Science 2025-10-01 Pierre Wolinski

Implicit differentiation for fast hyperparameter selection in non-smooth convex learning

Finding the optimal hyperparameters of a model can be cast as a bilevel optimization problem, typically solved using zero-order techniques. In this work we study first-order methods when the inner optimization problem is convex but…

Machine Learning · Statistics 2022-08-10 Quentin Bertrand, Quentin Klopfenstein, Mathurin Massias, Mathieu Blondel, Samuel Vaiter, Alexandre Gramfort, Joseph Salmon

Oracle Complexity of Second-Order Methods for Finite-Sum Problems

Finite-sum optimization problems are ubiquitous in machine learning, and are commonly solved using first-order methods which rely on gradient computations. Recently, there has been growing interest in second-order methods, which rely…

Optimization and Control · Mathematics 2017-03-09 Yossi Arjevani, Ohad Shamir

Low-Order Explicit Hessian Imitation Method for Large-Scale Supervised Machine Learning

An algorithm is proposed for solving optimization problems arising in neural network training for supervised learning. The unique feature of the algorithm is the use of an auxiliary loss, in addition to the original loss employed for model…

Optimization and Control · Mathematics 2026-05-11 Yunlang Zhu, Lingjun Guo, Zahra Khatti, Xiaoyi Qu, Chia-Yuan Wu, Lara Zebiane, Frank E. Curtis

A Communication and Computation Efficient Fully First-order Method for Decentralized Bilevel Optimization

Bilevel optimization, crucial for hyperparameter tuning, meta-learning and reinforcement learning, remains less explored in the decentralized learning paradigm, such as decentralized federated learning (DFL). Typically, decentralized…

Machine Learning · Computer Science 2024-10-21 Min Wen, Chengchang Liu, Ahmed Abdelmoniem, Yipeng Zhou, Yuedong Xu