Related papers: Implicit differentiation of Lasso-type models for …

Online Hyperparameter Search Interleaved with Proximal Parameter Updates

There is a clear need for efficient algorithms to tune hyperparameters for statistical learning schemes, since the commonly applied search methods (such as grid search with N-fold cross-validation) are inefficient and/or approximate.…

Machine Learning · Computer Science 2020-04-07 Luis Miguel Lopez-Ramos, Baltasar Beferull-Lozano

Gradient-based bilevel optimization for multi-penalty Ridge regression through matrix differential calculus

Common regularization algorithms for linear regression, such as LASSO and Ridge regression, rely on a regularization hyperparameter that balances the tradeoff between minimizing the fitting error and the norm of the learned model…

Machine Learning · Computer Science 2023-11-27 Gabriele Maroni, Loris Cannelli, Dario Piga

Iterative Implicit Gradients for Nonconvex Optimization with Variational Inequality Constraints

We propose an optimization proxy in terms of iterative implicit gradient methods for solving constrained optimization problems with nonconvex loss functions. This framework can be applied to a broad range of machine learning settings,…

Optimization and Control · Mathematics 2025-10-14 Harshal D. Kaushik, Ming Jin

Linear Discriminant Analysis with Gradient Optimization

Linear discriminant analysis (LDA) is a fundamental classification and dimension reduction method that achieves Bayes optimality under Gaussian mixture, but often struggles in high-dimensional settings where the covariance matrix cannot be…

Computation · Statistics 2026-04-06 Cencheng Shen, Yuexiao Dong

A Doubly Stochastically Perturbed Algorithm for Linearly Constrained Bilevel Optimization

In this work, we develop analysis and algorithms for a class of (stochastic) bilevel optimization problems whose lower-level (LL) problem is strongly convex and linearly constrained. Most existing approaches for solving such problems rely…

Optimization and Control · Mathematics 2025-04-08 Prashant Khanduri, Ioannis Tsaknakis, Yihua Zhang, Sijia Liu, Mingyi Hong

Optimizing Millions of Hyperparameters by Implicit Differentiation

We propose an algorithm for inexpensive gradient-based hyperparameter optimization that combines the implicit function theorem (IFT) with efficient inverse Hessian approximations. We present results about the relationship between the IFT…

Machine Learning · Computer Science 2019-11-11 Jonathan Lorraine, Paul Vicol, David Duvenaud

Asymptotic and finite-sample properties of estimators based on stochastic gradients

Stochastic gradient descent procedures have gained popularity for parameter estimation from large data sets. However, their statistical properties are not well understood, in theory. And in practice, avoiding numerical instability requires…

Methodology · Statistics 2016-09-29 Panos Toulis, Edoardo M. Airoldi

Amortized Implicit Differentiation for Stochastic Bilevel Optimization

We study a class of algorithms for solving bilevel optimization problems in both stochastic and deterministic settings when the inner-level objective is strongly convex. Specifically, we consider algorithms based on inexact implicit…

Optimization and Control · Mathematics 2022-07-12 Michael Arbel, Julien Mairal

Implicit differentiation for fast hyperparameter selection in non-smooth convex learning

Finding the optimal hyperparameters of a model can be cast as a bilevel optimization problem, typically solved using zero-order techniques. In this work we study first-order methods when the inner optimization problem is convex but…

Machine Learning · Statistics 2022-08-10 Quentin Bertrand, Quentin Klopfenstein, Mathurin Massias, Mathieu Blondel, Samuel Vaiter, Alexandre Gramfort, Joseph Salmon

Towards Differentiable Multilevel Optimization: A Gradient-Based Approach

Multilevel optimization has gained renewed interest in machine learning due to its promise in applications such as hyperparameter tuning and continual learning. However, existing methods struggle with the inherent difficulty of efficiently…

Machine Learning · Computer Science 2024-10-16 Yuntian Gu, Xuzheng Chen

Convergence Properties of Stochastic Hypergradients

Bilevel optimization problems are receiving increasing attention in machine learning as they provide a natural framework for hyperparameter optimization and meta-learning. A key step to tackle these problems is the efficient computation of…

Machine Learning · Statistics 2025-05-20 Riccardo Grazzi, Massimiliano Pontil, Saverio Salzo

An Alternative Graphical Lasso Algorithm for Precision Matrices

The Graphical Lasso (GLasso) algorithm is fast and widely used for estimating sparse precision matrices (Friedman et al., 2008). Its central role in the literature of high-dimensional covariance estimation rivals that of Lasso regression…

Computation · Statistics 2024-03-20 Aramayis Dallakyan, Mohsen Pourahmadi

Analyzing Inexact Hypergradients for Bilevel Learning

Estimating hyperparameters has been a long-standing problem in machine learning. We consider the case where the task at hand is modeled as the solution to an optimization problem. Here the exact gradient with respect to the hyperparameters…

Optimization and Control · Mathematics 2023-11-16 Matthias J. Ehrhardt, Lindon Roberts

Nyström Method for Accurate and Scalable Implicit Differentiation

The essential difficulty of gradient-based bilevel optimization using implicit differentiation is to estimate the inverse Hessian vector product with respect to neural network parameters. This paper proposes to tackle this problem by the…

Machine Learning · Computer Science 2023-02-21 Ryuichiro Hataya, Makoto Yamada

Convex vs nonconvex approaches for sparse estimation: GLasso, Multiple Kernel Learning and Hyperparameter GLasso

The popular Lasso approach for sparse estimation can be derived via marginalization of a joint density associated with a particular stochastic model. A different marginalization of the same probabilistic model leads to a different…

Machine Learning · Statistics 2013-02-28 Aleksandr Y. Aravkin, James V. Burke, Alessandro Chiuso, Gianluigi Pillonetto

Gradient Estimators for Implicit Models

Implicit models, which allow for the generation of samples but not for point-wise evaluation of probabilities, are omnipresent in real-world problems tackled by machine learning and a hot topic of current research. Some examples include…

Machine Learning · Statistics 2018-04-27 Yingzhen Li, Richard E. Turner

A Stochastic Gradient Method with Biased Estimation for Faster Nonconvex Optimization

A number of optimization approaches have been proposed for optimizing nonconvex objectives (e.g. deep learning models), such as batch gradient descent, stochastic gradient descent and stochastic variance reduced gradient descent. Theory…

Machine Learning · Computer Science 2019-05-15 Jia Bi, Steve R. Gunn

Parameter estimation by implicit sampling

Implicit sampling is a weighted sampling method that is used in data assimilation, where one sequentially updates estimates of the state of a stochastic model based on a stream of noisy or incomplete data. Here we describe how to use…

Numerical Analysis · Mathematics 2016-01-20 Matthias Morzfeld, Xuemin Tu, Jon Wilkening, Alexandre J. Chorin

High-Dimensional Linear Regression via Implicit Regularization

Many statistical estimators for high-dimensional linear regression are M-estimators, formed through minimizing a data-dependent square loss function plus a regularizer. This work considers a new class of estimators implicitly defined…

Statistics Theory · Mathematics 2022-02-15 Peng Zhao, Yun Yang, Qiao-Chu He

Implicit Rate-Constrained Optimization of Non-decomposable Objectives

We consider a popular family of constrained optimization problems arising in machine learning that involve optimizing a non-decomposable evaluation metric with a certain thresholded form, while constraining another metric of interest.…

Machine Learning · Computer Science 2021-07-30 Abhishek Kumar, Harikrishna Narasimhan, Andrew Cotter