Related papers: Stochastic Gradient Descent for Nonparametric Addi…

Simultaneous Model Selection and Optimization through Parameter-free Stochastic Learning

Stochastic gradient descent algorithms for training linear and kernel predictors are gaining more and more importance, thanks to their scalability. While various methods have been proposed to speed up their convergence, the model selection…

Machine Learning · Computer Science 2014-06-17 Francesco Orabona

Adaptive learning rates and parallelization for stochastic, sparse, non-smooth gradients

Recent work has established an empirically successful framework for adapting learning rates for stochastic gradient descent (SGD). This effectively removes all needs for tuning, while automatically reducing learning rates over time on…

Machine Learning · Computer Science 2013-03-28 Tom Schaul, Yann LeCun

Accelerated Almost-Sure Convergence Rates for Nonconvex Stochastic Gradient Descent using Stochastic Learning Rates

Large-scale optimization problems require algorithms both effective and efficient. One such popular and proven algorithm is Stochastic Gradient Descent which uses first-order gradient information to solve these problems. This paper studies…

Optimization and Control · Mathematics 2021-11-11 Theodoros Mamalis, Dusan Stipanovic, Petros Voulgaris

No More Pesky Learning Rates

The performance of stochastic gradient descent (SGD) depends critically on how learning rates are tuned and decreased over time. We propose a method to automatically adjust multiple learning rates so as to minimize the expected error at any…

Machine Learning · Statistics 2013-02-19 Tom Schaul, Sixin Zhang, Yann LeCun

Stochastic Gradient Descent for Constrained Optimization based on Adaptive Relaxed Barrier Functions

This paper presents a novel stochastic gradient descent algorithm for constrained optimization. The proposed algorithm randomly samples constraints and components of the finite sum objective function and relies on a relaxed logarithmic…

Optimization and Control · Mathematics 2025-05-13 Naum Dimitrieski, Jing Cao, Christian Ebenbauer

Stochastic Adaptive Gradient Descent Without Descent

We introduce a new adaptive step-size strategy for convex optimization with stochastic gradient that exploits the local geometry of the objective function only by means of a first-order stochastic oracle and without any hyper-parameter…

Machine Learning · Computer Science 2025-09-19 Jean-François Aujol, Jérémie Bigot, Camille Castera

Stochastic Gradient Descent Revisited

Stochastic gradient descent (SGD) has been a go-to algorithm for nonconvex stochastic optimization problems arising in machine learning. Its theory however often requires a strong framework to guarantee convergence properties. We hereby…

Optimization and Control · Mathematics 2025-03-11 Azar Louzi

Adaptive Gradient Descent for Optimal Control of Parabolic Equations with Random Parameters

In this paper we extend the adaptive gradient descent (AdaGrad) algorithm to the optimal distributed control of parabolic partial differential equations with uncertain parameters. This stochastic optimization method achieves an improved…

Optimization and Control · Mathematics 2021-10-22 Yanzhao Cao, Somak Das, Hans-Werner van Wyk

Stochastic Gradient Descent on a Tree: an Adaptive and Robust Approach to Stochastic Convex Optimization

Online minimization of an unknown convex function over the interval $[0,1]$ is considered under first-order stochastic bandit feedback, which returns a random realization of the gradient of the function at each query point. Without knowing…

Machine Learning · Statistics 2020-02-21 Sattar Vakili, Sudeep Salgia, Qing Zhao

Learning Rate Adaptation for Federated and Differentially Private Learning

We propose an algorithm for the adaptation of the learning rate for stochastic gradient descent (SGD) that avoids the need for validation set use. The idea for the adaptiveness comes from the technique of extrapolation: to get an estimate…

Machine Learning · Statistics 2020-08-28 Antti Koskela, Antti Honkela

A Robust Adaptive Stochastic Gradient Method for Deep Learning

Stochastic gradient algorithms are the main focus of large-scale optimization problems and led to important successes in the recent advancement of the deep learning algorithms. The convergence of SGD depends on the careful choice of…

Machine Learning · Computer Science 2017-03-03 Caglar Gulcehre, Jose Sotelo, Marcin Moczulski, Yoshua Bengio

Stochastic Gradient Descent for Nonconvex Learning without Bounded Gradient Assumptions

Stochastic gradient descent (SGD) is a popular and efficient method with wide applications in training deep neural nets and other nonconvex models. While the behavior of SGD is well understood in the convex learning setting, the existing…

Machine Learning · Computer Science 2019-12-16 Yunwen Lei, Ting Hu, Guiying Li, Ke Tang

Overparameterized Nonlinear Learning: Gradient Descent Takes the Shortest Path?

Many modern learning tasks involve fitting nonlinear models to data which are trained in an overparameterized regime where the parameters of the model exceed the size of the training dataset. Due to this overparameterization, the training…

Machine Learning · Computer Science 2018-12-27 Samet Oymak, Mahdi Soltanolkotabi

Online Learning Under A Separable Stochastic Approximation Framework

We propose an online learning algorithm for a class of machine learning models under a separable stochastic approximation framework. The essence of our idea lies in the observation that certain parameters in the models are easier to…

Machine Learning · Computer Science 2023-05-23 Min Gan, Xiang-xiang Su, Guang-yong Chen, Jing Chen

Stochastic gradient descent with random learning rate

We propose to optimize neural networks with a uniformly-distributed random learning rate. The associated stochastic gradient descent algorithm can be approximated by continuous stochastic equations and analyzed within the Fokker-Planck…

Machine Learning · Computer Science 2020-10-13 Daniele Musso

ADASECANT: Robust Adaptive Secant Method for Stochastic Gradient

Stochastic gradient algorithms have been the main focus of large-scale learning problems and they led to important successes in machine learning. The convergence of SGD depends on the careful choice of learning rate and the amount of the…

Machine Learning · Computer Science 2015-11-03 Caglar Gulcehre, Marcin Moczulski, Yoshua Bengio

Gradient Descent with Provably Tuned Learning-rate Schedules

Gradient-based iterative optimization methods are the workhorse of modern machine learning. They crucially rely on careful tuning of parameters like learning rate and momentum. However, one typically sets them using heuristic approaches…

Machine Learning · Computer Science 2025-12-05 Dravyansh Sharma

Towards Learning Stochastic Population Models by Gradient Descent

Increasing effort is put into the development of methods for learning mechanistic models from data. This task entails not only the accurate estimation of parameters but also a suitable model structure. Recent work on the discovery of…

Machine Learning · Computer Science 2024-07-01 Justin N. Kreikemeyer, Philipp Andelfinger, Adelinde M. Uhrmacher

Reinforced stochastic gradient descent for deep neural network learning

Stochastic gradient descent (SGD) is a standard optimization method to minimize a training error with respect to network parameters in modern neural network learning. However, it typically suffers from proliferation of saddle points in the…

Machine Learning · Computer Science 2017-11-23 Haiping Huang, Taro Toyoizumi

Resolving learning rates adaptively by locating Stochastic Non-Negative Associated Gradient Projection Points using line searches

Learning rates in stochastic neural network training are currently determined a priori to training, using expensive manual or automated iterative tuning. This study proposes gradient-only line searches to resolve the learning rate for…

Machine Learning · Statistics 2020-01-16 Dominic Kafka, Daniel N. Wilke