Escape saddle points by a simple gradient-descent based algorithm

Chenyi Zhang; Tongyang Li

Optimization and Control; Machine Learning | 2021-11-30 | v1

Abstract

Escaping saddle points is a central research topic in nonconvex optimization. In this paper, we propose a simple gradient-based algorithm that, for a smooth function $f\colon\mathbb{R}^n\to\mathbb{R}$, outputs an $\epsilon$-approximate second-order stationary point in $\tilde{O}(\log n/\epsilon^{1.75})$ iterations. Compared with the previous state-of-the-art algorithms by Jin et al., which take $\tilde{O}((\log n)^{4}/\epsilon^{2})$ or $\tilde{O}((\log n)^{6}/\epsilon^{1.75})$ iterations, our algorithm is polynomially better in terms of $\log n$ and matches their complexities in terms of $1/\epsilon$. In the stochastic setting, our algorithm outputs an $\epsilon$-approximate second-order stationary point in $\tilde{O}((\log n)^{2}/\epsilon^{4})$ iterations. Technically, our main contribution is a robust Hessian power method implemented using only gradients, which finds negative curvature near saddle points and achieves the polynomial speedup in $\log n$ over perturbed gradient descent methods. Finally, we perform numerical experiments that support our results.
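
The idea of a Hessian power method that uses only gradients can be illustrated with a minimal sketch. This is not the paper's algorithm: the toy objective, the function names (`grad_f`, `hessian_vector_product`, `negative_curvature_direction`), the finite-difference radius `r`, and the iteration count are all assumptions chosen for illustration. The sketch approximates the Hessian-vector product by a difference of two gradients and runs power iteration on $I - H/\ell$, where $\ell$ is an assumed smoothness constant, so that the most negative eigendirection of $H$ dominates.

```python
import math
import random


def grad_f(x):
    """Gradient of a toy saddle f(x) = 0.5 * (x0^2 + 0.5 * x1^2 - 0.3 * x2^2)."""
    return [x[0], 0.5 * x[1], -0.3 * x[2]]


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def hessian_vector_product(grad, x, v, r=1e-4):
    """Approximate H(x) v by a finite difference of two gradient evaluations."""
    shifted = [xi + r * vi for xi, vi in zip(x, v)]
    g_shifted, g_base = grad(shifted), grad(x)
    return [(a - b) / r for a, b in zip(g_shifted, g_base)]


def negative_curvature_direction(grad, x, ell, iters=200, seed=0):
    """Power iteration on (I - H / ell) using only gradient calls.

    ell is an assumed upper bound on the largest Hessian eigenvalue, so the
    eigenvalues of (I - H / ell) are nonnegative and the largest one belongs
    to the most negative eigenvalue of H.
    """
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in x]  # random start, as in a power method
    n = norm(v)
    v = [c / n for c in v]
    for _ in range(iters):
        hv = hessian_vector_product(grad, x, v)
        v = [vi - hvi / ell for vi, hvi in zip(v, hv)]  # apply (I - H / ell)
        n = norm(v)
        v = [c / n for c in v]  # renormalize to keep the iterate bounded
    hv = hessian_vector_product(grad, x, v)
    curvature = sum(vi * hvi for vi, hvi in zip(v, hv))  # Rayleigh quotient
    return v, curvature


direction, curvature = negative_curvature_direction(grad_f, [0.0, 0.0, 0.0], ell=1.0)
print(direction, curvature)
```

On this toy quadratic the Hessian is diagonal with entries 1, 0.5 and -0.3, so the returned direction aligns with the third coordinate axis (up to sign) and the estimated curvature is close to -0.3. Moving along that direction decreases the function, which is how a saddle point is escaped.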

Keywords

saddle point; optimization algorithms; gradient descent; optimization

Cite

@article{arxiv.2111.14069,
  title  = {Escape saddle points by a simple gradient-descent based algorithm},
  author = {Chenyi Zhang and Tongyang Li},
  journal= {arXiv preprint arXiv:2111.14069},
  year   = {2021}
}

Comments

34 pages, 8 figures, to appear in the 35th Conference on Neural Information Processing Systems (NeurIPS 2021)
