Related papers: A Fully First-Order Layer for Differentiable Optim…

200 papers

This work provides the first finite-time convergence guarantees for linearly constrained stochastic bilevel optimization using only first-order methods, requiring solely gradient information without any Hessian computations or second-order…

Optimization and Control · Mathematics 2025-11-18 Cac Phan, Kai Wang

Algorithms for bilevel optimization often encounter Hessian computations, which are prohibitive in high dimensions. While recent works offer first-order methods for unconstrained bilevel problems, the constrained setting remains relatively…

Optimization and Control · Mathematics 2025-04-22 Guy Kornowski, Swati Padmanabhan, Kai Wang, Zhe Zhang, Suvrit Sra

Bilevel optimization (BO) is useful for solving a variety of important machine learning problems including but not limited to hyperparameter optimization, meta-learning, continual learning, and reinforcement learning. Conventional BO…

Machine Learning · Computer Science 2022-09-20 Mao Ye, Bo Liu, Stephen Wright, Peter Stone, Qiang Liu

In this work, we study nonconvex-strongly convex online bilevel optimization (OBO) using only first-order oracle. Existing OBO algorithms are mainly based on hypergradient descent, which requires access to a Hessian-vector product (HVP)…

Machine Learning · Computer Science 2026-05-12 Tingkai Jia, Cheng Chen

We study bilevel optimization problems where the lower-level problems are strongly convex and have coupled linear constraints. To overcome the potential non-smoothness of the hyper-objective and the computational challenges associated with…

Optimization and Control · Mathematics 2026-02-06 Wei Shen, Jiawei Zhang, Minhui Huang, Cong Shen

A very simple first-order algorithm is proposed for solving nonlinear optimization problems with deterministic nonlinear equality constraints. This algorithm adaptively selects steps in the plane tangent to the constraints or steps that…

Optimization and Control · Mathematics 2026-03-11 Serge Gratton, Philippe L. Toint

Standard gradient descent methods are susceptible to a range of issues that can impede training, such as high correlations and different scaling in parameter space. These difficulties can be addressed by second-order approaches that apply a…

Machine Learning · Computer Science 2020-04-29 Ted Moskovitz, Rui Wang, Janice Lan, Sanyam Kapoor, Thomas Miconi, Jason Yosinski, Aditya Rawal

We consider stochastic unconstrained bilevel optimization problems when only the first-order gradient oracles are available. While numerous optimization methods have been proposed for tackling bilevel problems, existing methods either tend…

Optimization and Control · Mathematics 2023-01-27 Jeongyeol Kwon, Dohyun Kwon, Stephen Wright, Robert Nowak

We present differentially private (DP) algorithms for bilevel optimization, a problem class that received significant attention lately in various machine learning applications. These are the first algorithms for such problems under standard…

Machine Learning · Computer Science 2026-01-15 Guy Kornowski

In this work we derive a second-order approach to bilevel optimization, a type of mathematical programming in which the solution to a parameterized optimization problem (the "lower" problem) is itself to be optimized (in the "upper"…

Optimization and Control · Mathematics 2022-05-06 Robert Dyro, Edward Schmerling, Nikos Arechiga, Marco Pavone

The success of deep learning over the past decade mainly relies on gradient-based optimisation and backpropagation. This paper focuses on analysing the performance of first-order gradient-based optimisation algorithms, gradient descent and…

Optimization and Control · Mathematics 2022-12-08 Behnam Mafakheri, Iman Shames, Jonathan H. Manton

In this work, we consider bilevel optimization when the lower-level problem is strongly convex. Recent works show that with a Hessian-vector product (HVP) oracle, one can provably find an $\epsilon$-stationary point within…

Optimization and Control · Mathematics 2026-05-26 Lesi Chen, Yaohua Ma, Jingzhao Zhang

Bilevel optimization has arisen as a powerful tool in modern machine learning. However, due to the nested structure of bilevel optimization, even gradient-based methods require second-order derivative approximations via Jacobian- or/and…

Machine Learning · Computer Science 2022-06-07 Daouda Sow, Kaiyi Ji, Yingbin Liang

Multilevel optimization has gained renewed interest in machine learning due to its promise in applications such as hyperparameter tuning and continual learning. However, existing methods struggle with the inherent difficulty of efficiently…

Machine Learning · Computer Science 2024-10-16 Yuntian Gu, Xuzheng Chen

We present a new feasible proximal gradient method for constrained optimization where both the objective and constraint functions are given by the summation of a smooth, possibly nonconvex function and a convex simple function. The…

Optimization and Control · Mathematics 2024-02-01 Digvijay Boob, Qi Deng, Guanghui Lan

When training large models, such as neural networks, the full derivatives of order 2 and beyond are usually inaccessible, due to their computational cost. Therefore, among the second-order optimization methods, it is common to bypass the…

Machine Learning · Computer Science 2025-10-01 Pierre Wolinski

Finding the optimal hyperparameters of a model can be cast as a bilevel optimization problem, typically solved using zero-order techniques. In this work we study first-order methods when the inner optimization problem is convex but…

Finite-sum optimization problems are ubiquitous in machine learning, and are commonly solved using first-order methods which rely on gradient computations. Recently, there has been growing interest in second-order methods, which rely…

Optimization and Control · Mathematics 2017-03-09 Yossi Arjevani, Ohad Shamir

An algorithm is proposed for solving optimization problems arising in neural network training for supervised learning. The unique feature of the algorithm is the use of an auxiliary loss, in addition to the original loss employed for model…

Optimization and Control · Mathematics 2026-05-11 Yunlang Zhu, Lingjun Guo, Zahra Khatti, Xiaoyi Qu, Chia-Yuan Wu, Lara Zebiane, Frank E. Curtis

Bilevel optimization, crucial for hyperparameter tuning, meta-learning and reinforcement learning, remains less explored in the decentralized learning paradigm, such as decentralized federated learning (DFL). Typically, decentralized…

Machine Learning · Computer Science 2024-10-21 Min Wen, Chengchang Liu, Ahmed Abdelmoniem, Yipeng Zhou, Yuedong Xu