Related papers: Optimization without Backpropagation

200 papers

Using backpropagation to compute gradients of objective functions for optimization has remained a mainstay of machine learning. Backpropagation, or reverse-mode differentiation, is a special case within the general family of automatic…

Machine Learning · Computer Science 2022-02-18 Atılım Güneş Baydin, Barak A. Pearlmutter, Don Syme, Frank Wood, Philip Torr

The gradients used to train neural networks are typically computed using backpropagation. While an efficient way to obtain exact gradients, backpropagation is computationally expensive, hinders parallelization, and is biologically…

Machine Learning · Computer Science 2026-01-14 Katharina Flügel, Daniel Coquelin, Marie Weiel, Charlotte Debus, Achim Streit, Markus Götz

Forward Gradients - the idea of using directional derivatives in forward differentiation mode - have recently been shown to be utilizable for neural network training while avoiding problems generally associated with backpropagation gradient…

Machine Learning · Computer Science 2023-06-13 Louis Fournier, Stéphane Rivaud, Eugene Belilovsky, Michael Eickenberg, Edouard Oyallon

While backpropagation (reverse-mode automatic differentiation) has been extraordinarily successful in deep learning, it requires two passes (forward and backward) through the neural network and the storage of intermediate activations.…

Machine Learning · Computer Science 2025-11-06 Daniel Wang, Evan Markou, Dylan Campbell

The success of deep learning over the past decade mainly relies on gradient-based optimisation and backpropagation. This paper focuses on analysing the performance of first-order gradient-based optimisation algorithms, gradient descent and…

Optimization and Control · Mathematics 2022-12-08 Behnam Mafakheri, Iman Shames, Jonathan H. Manton

Bilevel optimization has been recently revisited for designing and analyzing algorithms in hyperparameter tuning and meta learning tasks. However, due to its nested structure, evaluating exact gradients for high-dimensional problems is…

Machine Learning · Computer Science 2019-04-09 Amirreza Shaban, Ching-An Cheng, Nathan Hatch, Byron Boots

Working with any gradient-based machine learning algorithm involves the tedious task of tuning the optimizer's hyperparameters, such as its step size. Recent work has shown how the step size can itself be optimized alongside the model…

Machine Learning · Computer Science 2022-10-18 Kartik Chandra, Audrey Xie, Jonathan Ragan-Kelley, Erik Meijer

Tuning hyperparameters of learning algorithms is hard because gradients are usually unavailable. We compute exact gradients of cross-validation performance with respect to all hyperparameters by chaining derivatives backwards through the…

Machine Learning · Statistics 2015-04-03 Dougal Maclaurin, David Duvenaud, Ryan P. Adams

Given the limitations of backpropagation, perturbation-based gradient computation methods have recently gained attention for learning with only forward passes, also referred to as queries. Conventional forward learning consumes an enormous number of queries…

Machine Learning · Computer Science 2025-03-11 Tao Ren, Zishi Zhang, Jinyang Jiang, Guanghao Li, Zeliang Zhang, Mingqian Feng, Yijie Peng

Estimating hyperparameters has been a long-standing problem in machine learning. We consider the case where the task at hand is modeled as the solution to an optimization problem. Here the exact gradient with respect to the hyperparameters…

Optimization and Control · Mathematics 2023-11-16 Matthias J. Ehrhardt, Lindon Roberts

We propose randomized subspace gradient methods for high-dimensional constrained optimization. While there have been similarly purposed studies on unconstrained optimization problems, there have been few on constrained optimization problems…

Optimization and Control · Mathematics 2023-07-10 Ryota Nozawa, Pierre-Louis Poirion, Akiko Takeda

We study two procedures (reverse-mode and forward-mode) for computing the gradient of the validation error with respect to the hyperparameters of any iterative learning algorithm such as stochastic gradient descent. These procedures mirror…

Machine Learning · Statistics 2017-12-13 Luca Franceschi, Michele Donini, Paolo Frasconi, Massimiliano Pontil

Local optimization presents a promising approach to expensive, high-dimensional black-box optimization by sidestepping the need to globally explore the search space. For objective functions whose gradient cannot be evaluated directly,…

Machine Learning · Computer Science 2023-01-18 Quan Nguyen, Kaiwen Wu, Jacob R. Gardner, Roman Garnett

Recent advances in convex optimization have leveraged computer-assisted proofs to develop optimized first-order methods that improve over classical algorithms. However, each optimized method is specially tailored for a particular problem…

Optimization and Control · Mathematics 2025-07-01 Jinho Bok, Jason M. Altschuler

The performance of optimization methods is often tied to the spectrum of the objective Hessian. Yet conventional assumptions, such as smoothness, often do not enable us to make fine-grained convergence statements, particularly not for…

Optimization and Control · Mathematics 2024-02-08 Nikita Doikov, Sebastian U. Stich, Martin Jaggi

The back-propagation algorithm has long been the de facto standard for optimizing weights and biases in neural networks, particularly in cutting-edge deep learning models. Its widespread adoption in fields like natural language processing,…

Computer Vision and Pattern Recognition · Computer Science 2023-07-04 Sidike Paheding, Abel A. Reyes-Angulo

Motivated by the problem of tuning hyperparameters in machine learning, we present a new approach for gradually and adaptively optimizing an unknown function using estimated gradients. We validate the empirical performance of the proposed…

Machine Learning · Computer Science 2019-06-05 Weijia Shao, Christian Geißler, Fikret Sivrikaya

In this paper, we consider the composite optimization problem, where the objective function integrates a continuously differentiable loss function with a nonsmooth regularization term. Moreover, only the function values for the…

Optimization and Control · Mathematics 2024-01-09 Shanglin Liu, Lei Wang, Nachuan Xiao, Xin Liu

The vast majority of optimization and online learning algorithms today require some prior information about the data (often in the form of bounds on gradients or on the optimal parameter value). When this information is not available, these…

Machine Learning · Computer Science 2017-06-07 Ashok Cutkosky, Kwabena Boahen

Extrapolation is a well-known technique for solving convex optimization problems and variational inequalities, and it has recently attracted some attention for non-convex optimization. Several recent works have empirically shown its success in some machine…

Optimization and Control · Mathematics 2019-02-06 Yi Xu, Zhuoning Yuan, Sen Yang, Rong Jin, Tianbao Yang