Related papers: Provably Correct Automatic Subdifferentiation for …

Differentiating a Tensor Language

How does one compile derivatives of tensor programs, such that the resulting code is purely functional (hence easier to optimize and parallelize) and provably efficient relative to the original program? We show that naively differentiating…

Programming Languages · Computer Science 2020-10-01 Gilbert Bernstein, Michael Mara, Tzu-Mao Li, Dougal Maclaurin, Jonathan Ragan-Kelley

Efficient Calculation of Regular Simplex Gradients

Simplex gradients are an essential feature of many derivative free optimization algorithms, and can be employed, for example, as part of the process of defining a direction of search, or as part of a termination criterion. The calculation…

Optimization and Control · Mathematics 2018-07-26 Ian Coope, Rachael Tappenden

On the complexity of nonsmooth automatic differentiation

Using the notion of conservative gradient, we provide a simple model to estimate the computational costs of the backward and forward modes of algorithmic differentiation for a wide class of nonsmooth programs. The overhead complexity of the…

Numerical Analysis · Mathematics 2023-02-07 Jérôme Bolte, Ryan Boustany, Edouard Pauwels, Béatrice Pesquet-Popescu

Trading-off variance and complexity in stochastic gradient descent

Stochastic gradient descent is the method of choice for large-scale machine learning problems, by virtue of its light complexity per iteration. However, it lags behind its non-stochastic counterparts with respect to the convergence rate,…

Machine Learning · Statistics 2016-03-23 Vatsal Shah, Megasthenis Asteris, Anastasios Kyrillidis, Sujay Sanghavi

Gradient of the Value Function in Parametric Convex Optimization Problems

We investigate the computation of the gradient of the value function in parametric convex optimization problems. We derive a general expression for the gradient of the value function in terms of the cost function, constraints and Lagrange…

Optimization and Control · Mathematics 2016-07-04 Mato Baotić

Proximal-gradient algorithms for fractional programming

In this paper we propose two proximal gradient algorithms for fractional programming problems in real Hilbert spaces, where the numerator is a proper, convex and lower semicontinuous function and the denominator is a smooth function, either…

Optimization and Control · Mathematics 2016-02-01 Radu Ioan Bot, Ernö Robert Csetnek

Gradient and Hessian approximations in Derivative Free Optimization

This work investigates finite differences and the use of interpolation models to obtain approximations to the first and second derivatives of a function. Here, it is shown that if a particular set of points is used in the interpolation…

Optimization and Control · Mathematics 2020-01-24 Ian D. Coope, Rachael Tappenden

Subsampling Algorithms for Semidefinite Programming

We derive a stochastic gradient algorithm for semidefinite optimization using randomization techniques. The algorithm uses subsampling to reduce the computational cost of each iteration and the subsampling ratio explicitly controls…

Optimization and Control · Mathematics 2011-08-30 Alexandre d'Aspremont

Regular Subgradients of Marginal Functions with Applications to Calculus and Bilevel Programming

The paper addresses the study and applications of a broad class of extended-real-valued functions, known as optimal value or marginal functions, which frequently appear in variational analysis, parametric optimization, and a variety…

Optimization and Control · Mathematics 2025-02-05 Le Phuoc Hai, Felipe Lara, Boris S. Mordukhovich

Gradient Descent for Low-Rank Functions

Several recent empirical studies demonstrate that important machine learning tasks, e.g., training deep neural networks, exhibit low-rank structure, where the loss function varies significantly in only a few directions of the input space.…

Machine Learning · Computer Science 2022-06-17 Romain Cosson, Ali Jadbabaie, Anuran Makur, Amirhossein Reisizadeh, Devavrat Shah

Analyzing Inexact Hypergradients for Bilevel Learning

Estimating hyperparameters has been a long-standing problem in machine learning. We consider the case where the task at hand is modeled as the solution to an optimization problem. Here the exact gradient with respect to the hyperparameters…

Optimization and Control · Mathematics 2023-11-16 Matthias J. Ehrhardt, Lindon Roberts

Learning complexity of gradient descent and conjugate gradient algorithms

Gradient Descent (GD) and Conjugate Gradient (CG) methods are among the most effective iterative algorithms for solving unconstrained optimization problems, particularly in machine learning and statistical modeling, where they are employed…

Optimization and Control · Mathematics 2024-12-19 Xianqi Jiao, Jia Liu, Zhiping Chen

Provably Accelerating Ill-Conditioned Low-rank Estimation via Scaled Gradient Descent, Even with Overparameterization

Many problems encountered in science and engineering can be formulated as estimating a low-rank object (e.g., matrices and tensors) from incomplete, and possibly corrupted, linear measurements. Through the lens of matrix and tensor…

Machine Learning · Computer Science 2023-10-11 Cong Ma, Xingyu Xu, Tian Tong, Yuejie Chi

Computing the gradients with respect to all parameters of a quantum neural network using a single circuit

Finding gradients is a crucial step in training machine learning models. For quantum neural networks, computing gradients using the parameter-shift rule requires calculating the cost function twice for each adjustable parameter in the…

Quantum Physics · Physics 2025-01-31 Guang Ping He

Computation of Generalized Derivatives for Abs-Smooth Functions by Backward Mode Algorithmic Differentiation and Implications to Deep Learning

Algorithmic differentiation (AD) tools make it possible to obtain gradient information of a continuously differentiable objective function in a computationally cheap way using the so-called backward mode. It is common practice to use the same tools…

Optimization and Control · Mathematics 2024-12-02 Lukas Baumgärtner, Franz Bethke

Gradient-based dimension reduction of multivariate vector-valued functions

Multivariate functions encountered in high-dimensional uncertainty quantification problems often vary most strongly along a few dominant directions in the input parameter space. We propose a gradient-based method for detecting these…

Analysis of PDEs · Mathematics 2019-11-11 Olivier Zahm, Paul Constantine, Clémentine Prieur, Youssef Marzouk

Gradient Estimation Using Stochastic Computation Graphs

In a variety of problems originating in supervised, unsupervised, and reinforcement learning, the loss function is defined by an expectation over a collection of random variables, which might be part of a probabilistic model or the external…

Machine Learning · Computer Science 2016-01-06 John Schulman, Nicolas Heess, Theophane Weber, Pieter Abbeel

Gradients without Backpropagation

Using backpropagation to compute gradients of objective functions for optimization has remained a mainstay of machine learning. Backpropagation, or reverse-mode differentiation, is a special case within the general family of automatic…

Machine Learning · Computer Science 2022-02-18 Atılım Güneş Baydin, Barak A. Pearlmutter, Don Syme, Frank Wood, Philip Torr

A Relational Gradient Descent Algorithm For Support Vector Machine Training

We consider gradient-descent-like algorithms for Support Vector Machine (SVM) training when the data is in relational form. The gradient of the SVM objective cannot be efficiently computed by known techniques as it suffers from the…

Data Structures and Algorithms · Computer Science 2020-05-13 Mahmoud Abo-Khamis, Sungjin Im, Benjamin Moseley, Kirk Pruhs, Alireza Samadian

How to guess a gradient

How much can you say about the gradient of a neural network without computing a loss or knowing the label? This may sound like a strange question: surely the answer is "very little." However, in this paper, we show that gradients are more…

Machine Learning · Computer Science 2023-12-11 Utkarsh Singhal, Brian Cheung, Kartik Chandra, Jonathan Ragan-Kelley, Joshua B. Tenenbaum, Tomaso A. Poggio, Stella X. Yu