Related papers: Learning to Initialize Gradient Descent Using Grad…

200 papers

Gradient descent and coordinate descent are well understood in terms of their asymptotic behavior, but less so in a transient regime often used for approximations in machine learning. We investigate how proper initialization can have a…

Machine Learning · Computer Science 2017-06-14 Hadi Daneshmand, Hamed Hassani, Thomas Hofmann

Randomly initialized first-order optimization algorithms are the method of choice for solving many high-dimensional nonconvex problems in machine learning, yet general theoretical guarantees cannot rule out convergence to critical points of…

Optimization and Control · Mathematics 2018-09-28 Dar Gilboa, Sam Buchanan, John Wright

Adaptive gradient optimization methods, such as Adam, are prevalent in training deep neural networks across diverse machine learning tasks due to their ability to achieve faster convergence. However, these methods often suffer from…

Machine Learning · Computer Science 2025-02-12 Abulikemu Abuduweili, Changliu Liu

Large-scale optimization problems require algorithms both effective and efficient. One such popular and proven algorithm is Stochastic Gradient Descent which uses first-order gradient information to solve these problems. This paper studies…

Optimization and Control · Mathematics 2021-11-11 Theodoros Mamalis, Dusan Stipanovic, Petros Voulgaris

Recent years have seen a flurry of activities in designing provably efficient nonconvex procedures for solving statistical estimation problems. Due to the highly nonconvex nature of the empirical loss, state-of-the-art procedures often…

Machine Learning · Computer Science 2020-06-09 Cong Ma, Kaizheng Wang, Yuejie Chi, Yuxin Chen

Learning rules -- prescriptions for updating model parameters to improve performance -- are typically assumed rather than derived. Why do some learning rules work better than others, and under what assumptions can a given rule be considered…

Machine Learning · Computer Science 2025-11-03 John J. Vastola, Samuel J. Gershman, Kanaka Rajan

We propose a variant of the classical conditional gradient method for sparse inverse problems with differentiable measurement models. Such models arise in many practical problems including superresolution, time-series modeling, and matrix…

Optimization and Control · Mathematics 2015-07-07 Nicholas Boyd, Geoffrey Schiebinger, Benjamin Recht

We study the gradient method under the assumption that an additively inexact gradient is available for, generally speaking, non-convex problems. The non-convexity of the objective function, as well as the use of an inexactness specified…

Optimization and Control · Mathematics 2022-12-13 Boris T. Polyak, Ilia A. Kuruzov, Fedor S. Stonyakin

Learning to optimize - the idea that we can learn from data algorithms that optimize a numerical criterion - has recently been at the heart of a growing number of research efforts. One of the most challenging issues within this approach is…

Machine Learning · Computer Science 2018-02-21 Louis Faury, Flavian Vasile

In this paper we propose several adaptive gradient methods for stochastic optimization. Unlike AdaGrad-type of methods, our algorithms are based on Armijo-type line search and they simultaneously adapt to the unknown Lipschitz constant of…

We consider chance-constrained problems with discrete random distribution. We aim for problems with a large number of scenarios. We propose a novel method based on the stochastic gradient descent method which performs updates of the…

Optimization and Control · Mathematics 2019-05-28 Lukáš Adam, Martin Branda

Driven by the need to solve increasingly complex optimization problems in signal processing and machine learning, there has been increasing interest in understanding the behavior of gradient-descent algorithms in non-convex environments.…

Optimization and Control · Mathematics 2019-07-04 Stefan Vlaski, Ali H. Sayed

Any gradient descent optimization requires choosing a learning rate. With deeper and deeper models, tuning that learning rate can easily become tedious and does not necessarily lead to ideal convergence. We propose a variation of the…

Machine Learning · Statistics 2018-04-10 Mathieu Ravaut, Satya Gorti

Despite the recent success of stochastic gradient descent in deep learning, it is often difficult to train a deep neural network with an inappropriate choice of its initial parameters. Even if training is successful, it has been known that…

Machine Learning · Computer Science 2023-02-10 Cheolhyoung Lee, Kyunghyun Cho

Recently there has been significant theoretical progress on understanding the convergence and generalization of gradient-based methods on nonconvex losses with overparameterized models. Nevertheless, many aspects of optimization and…

Machine Learning · Computer Science 2022-09-27 Dominik Stöger, Mahdi Soltanolkotabi

Low-rank matrix estimation is a canonical problem that finds numerous applications in signal processing, machine learning and imaging science. A popular approach in practice is to factorize the matrix into two compact low-rank factors, and…

Machine Learning · Computer Science 2021-06-16 Tian Tong, Cong Ma, Yuejie Chi

We study the data deletion problem for convex models. By leveraging techniques from convex optimization and reservoir sampling, we give the first data deletion algorithms that are able to handle an arbitrarily long sequence of adversarial…

Machine Learning · Statistics 2020-07-07 Seth Neel, Aaron Roth, Saeed Sharifi-Malvajerdi

One of the mysteries in the success of neural networks is that randomly initialized first-order methods like gradient descent can achieve zero training loss even though the objective function is non-convex and non-smooth. This paper demystifies…

Machine Learning · Computer Science 2019-02-06 Simon S. Du, Xiyu Zhai, Barnabas Poczos, Aarti Singh

We propose an optimization proxy in terms of iterative implicit gradient methods for solving constrained optimization problems with nonconvex loss functions. This framework can be applied to a broad range of machine learning settings,…

Optimization and Control · Mathematics 2025-10-14 Harshal D. Kaushik, Ming Jin

Convex-concave min-max problems are ubiquitous in machine learning, and people usually utilize first-order methods (e.g., gradient descent ascent) to find the optimal solution. One feature which separates convex-concave min-max problems…

Optimization and Control · Mathematics 2022-03-09 Mingrui Liu, Francesco Orabona