English
Related papers

Related papers: Improved Binary Forward Exploration: Learning Rate…

200 papers

In this paper, a new gradient-based optimization approach by automatically adjusting the learning rate is proposed. This approach can be applied to design non-adaptive learning rate and adaptive learning rate. Firstly, I will introduce the…

Machine Learning · Computer Science 2022-07-07 Xin Cao

In this paper, we present a novel stochastic optimization method, which uses the binary search technique with first order gradient based optimization method, called Binary Search Gradient Optimization (BSG) or BiGrad. In this optimization…

Machine Learning · Computer Science 2020-07-28 Vijay Pandey

Adaptive gradient methods for stochastic optimization adjust the learning rate for each parameter locally. However, there is also a global learning rate which must be tuned in order to get the best performance. In this paper, we present a…

Machine Learning · Computer Science 2018-06-12 Hiroaki Hayashi , Jayanth Koushik , Graham Neubig

Stochastic gradient algorithms are the main focus of large-scale optimization problems and led to important successes in the recent advancement of the deep learning algorithms. The convergence of SGD depends on the careful choice of…

Machine Learning · Computer Science 2017-03-03 Caglar Gulcehre , Jose Sotelo , Marcin Moczulski , Yoshua Bengio

We introduce a novel algorithm for gradient-based optimization of stochastic objective functions. The method may be seen as a variant of SGD with momentum equipped with an adaptive learning rate automatically adjusted by an 'energy'…

Optimization and Control · Mathematics 2022-03-24 Hailiang Liu , Xuping Tian

We propose a new, more general approach to the design of stochastic gradient-based optimization methods for machine learning. In this new framework, optimizers assume access to a batch of gradient estimates per iteration, rather than a…

Machine Learning · Computer Science 2021-12-02 Julius Kunze , James Townsend , David Barber

Conditional stochastic optimization covers a variety of applications ranging from invariant learning and causal inference to meta-learning. However, constructing unbiased gradient estimators for such problems is challenging due to the…

Optimization and Control · Mathematics 2024-06-04 Yifan Hu , Siqi Zhang , Xin Chen , Niao He

Stochastic gradient algorithms have been the main focus of large-scale learning problems and they led to important successes in machine learning. The convergence of SGD depends on the careful choice of learning rate and the amount of the…

Machine Learning · Computer Science 2015-11-03 Caglar Gulcehre , Marcin Moczulski , Yoshua Bengio

This paper proposes SplitSGD, a new dynamic learning rate schedule for stochastic optimization. This method decreases the learning rate for better adaptation to the local geometry of the objective function whenever a stationary phase is…

Machine Learning · Statistics 2024-02-20 Matteo Sordello , Niccolò Dalmasso , Hangfeng He , Weijie Su

We present a novel stochastic approach to binary optimization for optimal experimental design (OED) for Bayesian inverse problems governed by mathematical models such as partial differential equations. The OED utility function, namely, the…

Optimization and Control · Mathematics 2022-06-28 Ahmed Attia , Sven Leyffer , Todd Munson

In the rapidly advancing field of deep learning, optimising deep neural networks is paramount. This paper introduces a novel method, Enhanced Velocity Estimation (EVE), which innovatively applies different learning rates to distinct…

Machine Learning · Computer Science 2023-08-22 Afshin Khadangi

We introduce a general method for improving the convergence rate of gradient-based optimizers that is easy to implement and works well in practice. We demonstrate the effectiveness of the method in a range of optimization problems by…

Machine Learning · Computer Science 2018-08-23 Atilim Gunes Baydin , Robert Cornish , David Martinez Rubio , Mark Schmidt , Frank Wood

The vast majority of successful deep neural networks are trained using variants of stochastic gradient descent (SGD) algorithms. Recent attempts to improve SGD can be broadly categorized into two approaches: (1) adaptive learning rate…

Machine Learning · Computer Science 2019-12-04 Michael R. Zhang , James Lucas , Geoffrey Hinton , Jimmy Ba

Stochastic gradient descent method and its variants constitute the core optimization algorithms that achieve good convergence rates for solving machine learning problems. These rates are obtained especially when these algorithms are…

Machine Learning · Computer Science 2024-03-14 S. Ilker Birbil , Ozgur Martin , Gonenc Onay , Figen Oztoprak

The convergence behavior of mini-batch stochastic gradient descent (SGD) is highly sensitive to the batch size and learning rate settings. Recent theoretical studies have identified the existence of a critical batch size that minimizes…

Machine Learning · Computer Science 2025-08-08 Hikaru Umeda , Hideaki Iiduka

We present a coupled system of ODEs which, when discretized with a constant time step/learning rate, recovers Nesterov's accelerated gradient descent algorithm. The same ODEs, when discretized with a decreasing learning rate, leads to novel…

Optimization and Control · Mathematics 2020-09-02 Maxime Laborde , Adam M. Oberman

In this work, multiplicative stochasticity is applied to the learning rate of stochastic optimization algorithms, giving rise to stochastic learning-rate schemes. In-expectation theoretical convergence results of Stochastic Gradient Descent…

Optimization and Control · Mathematics 2022-03-22 Theodoros Mamalis , Dusan Stipanovic , Petros Voulgaris

We propose an algorithm for the adaptation of the learning rate for stochastic gradient descent (SGD) that avoids the need for validation set use. The idea for the adaptiveness comes from the technique of extrapolation: to get an estimate…

Machine Learning · Statistics 2020-08-28 Antti Koskela , Antti Honkela

We propose a stochastic optimization method for minimizing loss functions, expressed as an expected value, that adaptively controls the batch size used in the computation of gradient approximations and the step size used to move along such…

Machine Learning · Computer Science 2020-03-04 Achraf Bahamou , Donald Goldfarb

We propose EAGLE update rule, a novel optimization method that accelerates loss convergence during the early stages of training by leveraging both current and previous step parameter and gradient values. The update algorithm estimates…

Machine Learning · Computer Science 2025-02-04 Takumi Fujimoto , Hiroaki Nishi
‹ Prev 1 2 3 10 Next ›