Related papers: Implicit differentiation for fast hyperparameter s…

Iterative Implicit Gradients for Nonconvex Optimization with Variational Inequality Constraints

We propose an optimization proxy in terms of iterative implicit gradient methods for solving constrained optimization problems with nonconvex loss functions. This framework can be applied to a broad range of machine learning settings,…

Optimization and Control · Mathematics 2025-10-14 Harshal D. Kaushik, Ming Jin

One-step differentiation of iterative algorithms

In appropriate frameworks, automatic differentiation is transparent to the user at the cost of being a significant computational burden when the number of operations is large. For iterative algorithms, implicit differentiation alleviates…

Optimization and Control · Mathematics 2023-05-24 Jérôme Bolte, Edouard Pauwels, Samuel Vaiter

You Shall Pass: Dealing with the Zero-Gradient Problem in Predict and Optimize for Convex Optimization

Predict and optimize is an increasingly popular decision-making paradigm that employs machine learning to predict unknown parameters of optimization problems. Instead of minimizing the prediction error of the parameters, it trains…

Machine Learning · Computer Science 2024-02-05 Grigorii Veviurko, Wendelin Böhmer, Mathijs de Weerdt

First-Order Methods for Convex Optimization

First-order methods for solving convex optimization problems have been at the forefront of mathematical optimization in the last 20 years. The rapid development of this important class of algorithms is motivated by the success stories…

Optimization and Control · Mathematics 2021-01-07 Pavel Dvurechensky, Mathias Staudigl, Shimrit Shtern

Exploring Jacobian Inexactness in Second-Order Methods for Variational Inequalities: Lower Bounds, Optimal Algorithms and Quasi-Newton Approximations

Variational inequalities represent a broad class of problems, including minimization and min-max problems, commonly found in machine learning. Existing second-order and high-order methods for variational inequalities require precise…

Optimization and Control · Mathematics 2024-05-29 Artem Agafonov, Petr Ostroukhov, Roman Mozhaev, Konstantin Yakovlev, Eduard Gorbunov, Martin Takáč, Alexander Gasnikov, Dmitry Kamzolov

Analyzing Inexact Hypergradients for Bilevel Learning

Estimating hyperparameters has been a long-standing problem in machine learning. We consider the case where the task at hand is modeled as the solution to an optimization problem. Here the exact gradient with respect to the hyperparameters…

Optimization and Control · Mathematics 2023-11-16 Matthias J. Ehrhardt, Lindon Roberts

Amortized Implicit Differentiation for Stochastic Bilevel Optimization

We study a class of algorithms for solving bilevel optimization problems in both stochastic and deterministic settings when the inner-level objective is strongly convex. Specifically, we consider algorithms based on inexact implicit…

Optimization and Control · Mathematics 2022-07-12 Michael Arbel, Julien Mairal

Primal subgradient methods with predefined stepsizes

In this paper, we suggest a new framework for analyzing primal subgradient methods for nonsmooth convex optimization problems. We show that the classical step-size rules, based on normalization of subgradient, or on the knowledge of optimal…

Optimization and Control · Mathematics 2023-11-27 Yurii Nesterov

Data-driven nonsmooth optimization

In this work, we consider methods for solving large-scale optimization problems with a possibly nonsmooth objective function. The key idea is to first specify a class of optimization algorithms using a generic iterative scheme involving…

Optimization and Control · Mathematics 2020-02-19 Sebastian Banert, Axel Ringh, Jonas Adler, Johan Karlsson, Ozan Öktem

Linearly convergent bilevel optimization with single-step inner methods

We propose a new approach to solving bilevel optimization problems, intermediate between solving full-system optimality conditions with a Newton-type approach, and treating the inner problem as an implicit function. The overall idea is to…

Optimization and Control · Mathematics 2024-05-08 Ensio Suonperä, Tuomo Valkonen

Estimation from Indirect Supervision with Linear Moments

In structured prediction problems where we have indirect supervision of the output, maximum marginal likelihood faces two computational obstacles: non-convexity of the objective and intractability of even a single gradient computation. In…

Machine Learning · Statistics 2016-08-11 Aditi Raghunathan, Roy Frostig, John Duchi, Percy Liang

Nonsmooth Implicit Differentiation for Machine Learning and Optimization

In view of training increasingly complex learning architectures, we establish a nonsmooth implicit function theorem with an operational calculus. Our result applies to most practical problems (i.e., definable problems) provided that a…

Machine Learning · Computer Science 2022-04-06 Jérôme Bolte, Tam Le, Edouard Pauwels, Antonio Silveti-Falls

Accelerated first-order methods for large-scale convex minimization

This paper discusses several (sub)gradient methods attaining the optimal complexity for smooth problems with Lipschitz continuous gradients, nonsmooth problems with bounded variation of subgradients, weakly smooth problems with Hölder…

Optimization and Control · Mathematics 2016-05-02 Masoud Ahookhosh

A Fast and Convergent Proximal Algorithm for Regularized Nonconvex and Nonsmooth Bi-level Optimization

Many important machine learning applications involve regularized nonconvex bi-level optimization. However, the existing gradient-based bi-level optimization algorithms cannot handle nonconvex or nonsmooth regularizers, and they suffer from…

Machine Learning · Computer Science 2022-06-06 Ziyi Chen, Bhavya Kailkhura, Yi Zhou

On the Convergence Theory for Hessian-Free Bilevel Algorithms

Bilevel optimization has arisen as a powerful tool in modern machine learning. However, due to the nested structure of bilevel optimization, even gradient-based methods require second-order derivative approximations via Jacobian- or/and…

Machine Learning · Computer Science 2022-06-07 Daouda Sow, Kaiyi Ji, Yingbin Liang

On Tackling High-Dimensional Nonconvex Stochastic Optimization via Stochastic First-Order Methods with Non-smooth Proximal Terms and Variance Reduction

When the nonconvex problem is complicated by stochasticity, the sample complexity of stochastic first-order methods may depend linearly on the problem dimension, which is undesirable for large-scale problems. To alleviate this linear…

Optimization and Control · Mathematics 2025-09-30 Yue Xie, Jiawen Bi, Hongcheng Liu

First order online optimisation using forward gradients in over-parameterised systems

The success of deep learning over the past decade mainly relies on gradient-based optimisation and backpropagation. This paper focuses on analysing the performance of first-order gradient-based optimisation algorithms, gradient descent and…

Optimization and Control · Mathematics 2022-12-08 Behnam Mafakheri, Iman Shames, Jonathan H. Manton

Adaptive First-and Zeroth-order Methods for Weakly Convex Stochastic Optimization Problems

In this paper, we design and analyze a new family of adaptive subgradient methods for solving an important class of weakly convex (possibly nonsmooth) stochastic optimization problems. Adaptive methods that use exponential moving averages…

Optimization and Control · Mathematics 2020-05-26 Parvin Nazari, Davoud Ataee Tarzanagh, George Michailidis

A First Order Method for Solving Convex Bi-Level Optimization Problems

In this paper we study convex bi-level optimization problems for which the inner level consists of minimization of the sum of smooth and nonsmooth functions. The outer level aims at minimizing a smooth and strongly convex function over the…

Optimization and Control · Mathematics 2017-02-15 Shoham Sabach, Shimrit Shtern

Understanding Optimization of Deep Learning via Jacobian Matrix and Lipschitz Constant

This article provides a comprehensive understanding of optimization in deep learning, with a primary focus on the challenges of gradient vanishing and gradient exploding, which normally lead to diminished model representational ability and…

Machine Learning · Computer Science 2023-11-14 Xianbiao Qi, Jianan Wang, Lei Zhang