Related papers: Easing Optimization Paths: a Circuit Perspective

A Comprehensive Study on Optimization Strategies for Gradient Descent In Deep Learning

One of the most important parts of Artificial Neural Networks is minimizing the loss functions which tells us how good or bad our model is. To minimize these losses we need to tune the weights and biases. Also to calculate the minimum value…

Machine Learning · Computer Science 2021-01-08 Kaustubh Yadav

Optimizing ML Training with Metagradient Descent

A major challenge in training large-scale machine learning models is configuring the training process to maximize model performance, i.e., finding the best training setup from a vast design space. In this work, we unlock a gradient-based…

Machine Learning · Statistics 2025-03-19 Logan Engstrom, Andrew Ilyas, Benjamin Chen, Axel Feldmann, William Moses, Aleksander Madry

A New Perspective of Accelerated Gradient Methods: The Controlled Invariant Manifold Approach

Gradient Descent (GD) is a ubiquitous algorithm for finding the optimal solution to an optimization problem. For reduced computational complexity, the optimal solution $\mathrm{x^*}$ of the optimization problem must be attained in a minimum…

Optimization and Control · Mathematics 2023-06-01 Revati Gunjal, Sushama Wagh, Syed Shadab Nayyer, Alex Stankovic, Navdeep M. Singh

A Gentle Introduction to Gradient-Based Optimization and Variational Inequalities for Machine Learning

The rapid progress in machine learning in recent years has been based on a highly productive connection to gradient-based optimization. Further progress hinges in part on a shift in focus from pattern recognition to decision-making and…

Machine Learning · Computer Science 2024-02-27 Neha S. Wadia, Yatin Dandi, Michael I. Jordan

Efficient learning with robust gradient descent

Minimizing the empirical risk is a popular training strategy, but for learning tasks where the data may be noisy or heavy-tailed, one may require many observations in order to generalize well. To achieve better performance under less…

Machine Learning · Statistics 2018-10-16 Matthew J. Holland, Kazushi Ikeda

Descent-Net: Learning Descent Directions for Constrained Optimization

Deep learning approaches, known for their ability to model complex relationships and fast execution, are increasingly being applied to solve large optimization problems. However, existing methods often face challenges in simultaneously…

Optimization and Control · Mathematics 2025-12-16 Zisheng Zhou, Dengyu Zheng, Zirui Chen, Shixiang Chen

Deep Neural Networks with Short Circuits for Improved Gradient Learning

Deep neural networks have achieved great success both in computer vision and natural language processing tasks. However, mostly state-of-art methods highly rely on external training or computing to improve the performance. To alleviate the…

Machine Learning · Computer Science 2020-09-25 Ming Yan, Xueli Xiao, Joey Tianyi Zhou, Yi Pan

Analysis of Natural Gradient Descent for Multilayer Neural Networks

Natural gradient descent is a principled method for adapting the parameters of a statistical model on-line using an underlying Riemannian parameter space to redefine the direction of steepest descent. The algorithm is examined via methods…

Disordered Systems and Neural Networks · Physics 2009-10-31 Magnus Rattray, David Saad

Natural Gradient Optimization for Optical Quantum Circuits

Optical quantum circuits can be optimized using gradient descent methods, as the gates in a circuit can be parametrized by continuous parameters. However, the parameter space as seen by the cost function is not Euclidean, which means that…

Quantum Physics · Physics 2022-05-11 Yuan Yao, Pierre Cussenot, Richard A. Wolf, Filippo M. Miatto

Gradient descent revisited via an adaptive online learning rate

Any gradient descent optimization requires to choose a learning rate. With deeper and deeper models, tuning that learning rate can easily become tedious and does not necessarily lead to an ideal convergence. We propose a variation of the…

Machine Learning · Statistics 2018-04-10 Mathieu Ravaut, Satya Gorti

An overview of gradient descent optimization algorithms

Gradient descent optimization algorithms, while increasingly popular, are often used as black-box optimizers, as practical explanations of their strengths and weaknesses are hard to come by. This article aims to provide the reader with…

Machine Learning · Computer Science 2017-06-16 Sebastian Ruder

Gradient Descent based Optimization Algorithms for Deep Learning Models Training

In this paper, we aim at providing an introduction to the gradient descent based optimization algorithms for learning deep neural network models. Deep learning models involving multiple nonlinear projection layers are very challenging to…

Machine Learning · Computer Science 2019-03-12 Jiawei Zhang

Gradient-Descent for Randomized Controllers under Partial Observability

Randomization is a powerful technique to create robust controllers, in particular in partially observable settings. The degrees of randomization have a significant impact on the system performance, yet they are intricate to get right. The…

Logic in Computer Science · Computer Science 2021-11-09 Linus Heck, Jip Spel, Sebastian Junges, Joshua Moerman, Joost-Pieter Katoen

Acceleration of Gradient-based Path Integral Method for Efficient Optimal and Inverse Optimal Control

This paper deals with a new accelerated path integral method, which iteratively searches optimal controls with a small number of iterations. This study is based on the recent observations that a path integral method for reinforcement…

Systems and Control · Computer Science 2019-10-08 Masashi Okada, Tadahiro Taniguchi

Optimizing User Interface Layouts via Gradient Descent

Automating parts of the user interface (UI) design process has been a longstanding challenge. We present an automated technique for optimizing the layouts of mobile UIs. Our method uses gradient descent on a neural network model of task…

Human-Computer Interaction · Computer Science 2020-02-26 Peitong Duan, Casimir Wierzynski, Lama Nachman

Optimization of Linear Multi-Agent Dynamical Systems via Feedback Distributed Gradient Descent Methods

Feedback optimization is an increasingly popular control paradigm to optimize dynamical systems, accounting for control objectives that concern the system operation at steady-state. Existing feedback optimization techniques heavily rely on…

Optimization and Control · Mathematics 2025-04-08 Amir Mehrnoosh, Gianluca Bianchin

Towards Guided Descent: Optimization Algorithms for Training Neural Networks At Scale

Neural network optimization remains one of the most consequential yet poorly understood challenges in modern AI research, where improvements in training algorithms can lead to enhanced feature learning in foundation models,…

Machine Learning · Computer Science 2025-12-23 Ansh Nagwekar

Gradient Descent, Stochastic Optimization, and Other Tales

The goal of this paper is to debunk and dispel the magic behind black-box optimizers and stochastic optimizers. It aims to build a solid foundation on how and why the techniques work. This manuscript crystallizes this knowledge by deriving…

Machine Learning · Computer Science 2024-01-15 Jun Lu

Coherent Gradients: An Approach to Understanding Generalization in Gradient Descent-based Optimization

An open question in the Deep Learning community is why neural networks trained with Gradient Descent generalize well on real datasets even though they are capable of fitting random data. We propose an approach to answering this question…

Machine Learning · Computer Science 2020-02-26 Satrajit Chatterjee

A Fixed-Point of View on Gradient Methods for Big Data

Interpreting gradient methods as fixed-point iterations, we provide a detailed analysis of those methods for minimizing convex objective functions. Due to their conceptual and algorithmic simplicity, gradient methods are widely used in…

Machine Learning · Statistics 2017-08-16 Alexander Jung