Related papers: First-order Methods Almost Always Avoid Saddle Poi…

First-order methods almost always avoid saddle points: the case of vanishing step-sizes

In a series of papers \cite{LSJR16, PP17, LPP}, it was established that some of the most commonly used first-order methods, under random initialization and with a sufficiently small step-size, almost surely avoid strict saddle points, as…

Optimization and Control · Mathematics 2025-09-30 Ioannis Panageas, Georgios Piliouras, Xiao Wang

Comment on "First-order methods almost always avoid strict saddle points"

The analysis on the global stability of Riemannian gradient descent method in manifold optimization (i.e., it avoids strict saddle points for almost all initializations) due to Lee et al. (Math. Program. 176:311-337) is corrected. Moreover,…

Optimization and Control · Mathematics 2022-04-04 Jinyang Zheng, Yong Xia

A Generic Approach for Escaping Saddle points

A central challenge to using first-order methods for optimizing nonconvex problems is the presence of saddle points. First-order methods often get stuck at saddle points, greatly deteriorating their performance. Typically, to escape from…

Machine Learning · Computer Science 2017-09-06 Sashank J Reddi, Manzil Zaheer, Suvrit Sra, Barnabas Poczos, Francis Bach, Ruslan Salakhutdinov, Alexander J Smola

Escaping Saddle Points with the Successive Convex Approximation Algorithm

Optimizing non-convex functions is of primary importance in the vast majority of machine learning algorithms. Even though many gradient descent based algorithms have been studied, successive convex approximation based algorithms have been…

Optimization and Control · Mathematics 2019-03-06 Amrit Singh Bedi, Ketan Rajawat, Vaneet Aggarwal

Heavy-ball Algorithms Always Escape Saddle Points

Nonconvex optimization algorithms with random initialization have attracted increasing attention recently. It has been shown that many first-order methods always avoid saddle points with random starting points. In this paper, we answer a…

Optimization and Control · Mathematics 2019-07-24 Tao Sun, Dongsheng Li, Zhe Quan, Hao Jiang, Shengguo Li, Yong Dou

Efficiently avoiding saddle points with zero order methods: No gradients required

We consider the case of derivative-free algorithms for non-convex optimization, also known as zero order algorithms, that use only function evaluations rather than gradients. For a wide variety of gradient approximators based on finite…

Optimization and Control · Mathematics 2019-10-30 Lampros Flokas, Emmanouil-Vasileios Vlatakis-Gkaragkounis, Georgios Piliouras

Efficiently Escaping Saddle Points under Generalized Smoothness via Self-Bounding Regularity

We study the optimization of non-convex functions that are not necessarily smooth (gradient and/or Hessian are Lipschitz) using first order methods. Smoothness is a restrictive assumption in machine learning in both theory and practice,…

Optimization and Control · Mathematics 2025-06-27 Daniel Yiming Cao, August Y. Chen, Karthik Sridharan, Benjamin Tang

First-order Stochastic Algorithms for Escaping From Saddle Points in Almost Linear Time

Two classes of methods have been proposed for escaping from saddle points: one uses the second-order information carried by the Hessian, and the other adds noise to the first-order information. The existing analysis for…

Optimization and Control · Mathematics 2018-03-05 Yi Xu, Rong Jin, Tianbao Yang

Efficient approaches for escaping higher order saddle points in non-convex optimization

Local search heuristics for non-convex optimizations are popular in applied machine learning. However, in general it is hard to guarantee that such algorithms even converge to a local minimum, due to the existence of complicated saddle…

Machine Learning · Computer Science 2016-02-19 Anima Anandkumar, Rong Ge

Escaping From Saddle Points Using Asynchronous Coordinate Gradient Descent

Large-scale non-convex optimization problems are expensive to solve due to computational and memory costs. To reduce the costs, first-order (computationally efficient) and asynchronous-parallel (memory efficient) algorithms are necessary to…

Optimization and Control · Mathematics 2022-11-21 Marco Bornstein, Jin-Peng Liu, Jingling Li, Furong Huang

Stochastic noise can be helpful for variational quantum algorithms

Saddle points constitute a crucial challenge for first-order gradient descent algorithms. In notions of classical machine learning, they are avoided for example by means of stochastic gradient descent methods. In this work, we provide…

Quantum Physics · Physics 2025-05-26 Junyu Liu, Frederik Wilde, Antonio Anna Mele, Xin Jin, Liang Jiang, Jens Eisert

On Constraints in First-Order Optimization: A View from Non-Smooth Dynamical Systems

We introduce a class of first-order methods for smooth constrained optimization that are based on an analogy to non-smooth dynamical systems. Two distinctive features of our approach are that (i) projections or optimizations over the entire…

Optimization and Control · Mathematics 2025-04-15 Michael Muehlebach, Michael I. Jordan

On Solving Minimization and Min-Max Problems by First-Order Methods with Relative Error in Gradients

First-order methods for minimization and saddle point (min-max) problems are widely used for solving large-scale problems, in particular arising in machine learning. The majority of works obtain favorable complexity guarantees of such…

Optimization and Control · Mathematics 2025-12-10 Artem Vasin, Valery Krivchenko, Dmitry Kovalev, Fedyor Stonyakin, Nazarii Tupitsa, Pavel Dvurechensky, Mohammad Alkousa, Nikita Kornilov, Alexander Gasnikov

Charged Point Normalization: An Efficient Solution to the Saddle Point Problem

Recently, the problem of local minima in very high dimensional non-convex optimization has been challenged and the problem of saddle points has been introduced. This paper introduces a dynamic type of normalization that forces the system to…

Machine Learning · Computer Science 2017-02-08 Armen Aghajanyan

Efficient Dictionary Learning with Gradient Descent

Randomly initialized first-order optimization algorithms are the method of choice for solving many high-dimensional nonconvex problems in machine learning, yet general theoretical guarantees cannot rule out convergence to critical points of…

Optimization and Control · Mathematics 2018-09-28 Dar Gilboa, Sam Buchanan, John Wright

How to Escape Saddle Points Efficiently

This paper shows that a perturbed form of gradient descent converges to a second-order stationary point in a number of iterations which depends only poly-logarithmically on dimension (i.e., it is almost "dimension-free"). The convergence rate…

Machine Learning · Computer Science 2017-03-03 Chi Jin, Rong Ge, Praneeth Netrapalli, Sham M. Kakade, Michael I. Jordan

Simba: A Scalable Bilevel Preconditioned Gradient Method for Fast Evasion of Flat Areas and Saddle Points

The convergence behaviour of first-order methods can be severely slowed down when applied to high-dimensional non-convex functions due to the presence of saddle points. If, additionally, the saddles are surrounded by large plateaus, it is…

Optimization and Control · Mathematics 2023-09-12 Nick Tsipinakis, Panos Parpas

Second-order methods for provably escaping strict saddle points in composite nonconvex and nonsmooth optimization

This study introduces two second-order methods designed to provably avoid saddle points in composite nonconvex optimization problems: (i) a nonsmooth trust-region method and (ii) a curvilinear linesearch method. These developments are…

Optimization and Control · Mathematics 2025-06-30 Alexander Bodard, Masoud Ahookhosh, Panagiotis Patrinos

Linear Regularizers Enforce the Strict Saddle Property

Satisfaction of the strict saddle property has become a standard assumption in non-convex optimization, and it ensures that many first-order optimization algorithms will almost always escape saddle points. However, functions exist in…

Optimization and Control · Mathematics 2022-08-23 Matthew Ubl, Kasra Yazdani, Matthew T. Hale

Identifying and attacking the saddle point problem in high-dimensional non-convex optimization

A central challenge to many fields of science and engineering involves minimizing non-convex error functions over continuous, high dimensional spaces. Gradient descent or quasi-Newton methods are almost ubiquitously used to perform such…

Machine Learning · Computer Science 2014-06-11 Yann Dauphin, Razvan Pascanu, Caglar Gulcehre, Kyunghyun Cho, Surya Ganguli, Yoshua Bengio