Continuous-time Models for Stochastic Optimization Algorithms

Antonio Orvieto; Aurelien Lucchi

Optimization and Control; Machine Learning. 2020-03-12 (v3)

Abstract

We propose new continuous-time formulations for first-order stochastic optimization algorithms such as mini-batch gradient descent and variance-reduced methods. We exploit these continuous-time models, together with simple Lyapunov analysis and tools from stochastic calculus, to derive convergence bounds for various types of non-convex functions. Guided by this analysis, we show that the same Lyapunov arguments hold in discrete time, leading to matching rates. In addition, we use these models and Itô calculus to infer novel insights on the dynamics of SGD, proving that a decreasing learning rate acts as time warping or, equivalently, as landscape stretching.
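As a rough illustration of what a continuous-time model of SGD can look like, the sketch below simulates a generic stochastic differential equation, dX = -∇f(X) dt + sqrt(lr) · noise_std dB, with the Euler-Maruyama scheme. This is an assumption made for illustration, not the specific model or analysis proposed in the paper: the function name `simulate_sgd_sde`, the constant isotropic noise level `noise_std`, and the quadratic test objective are all hypothetical. Choosing the integration step equal to the learning rate makes each Euler-Maruyama step coincide with an SGD step whose gradient noise is Gaussian.

```python
import numpy as np


def simulate_sgd_sde(grad, x0, lr, noise_std, n_steps, seed=0):
    """Euler-Maruyama simulation of dX = -grad(X) dt + sqrt(lr) * noise_std dB.

    Illustrative assumption: constant, isotropic gradient noise. With step
    size dt = lr, one update equals x - lr * (grad(x) + noise_std * z),
    where z is standard Gaussian, i.e. an SGD step with Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    path = [x.copy()]
    dt = lr  # integration step matched to the learning rate
    for _ in range(n_steps):
        # Brownian increment over a step of length dt: N(0, dt) per coordinate
        dB = rng.normal(scale=np.sqrt(dt), size=x.shape)
        x = x - dt * grad(x) + np.sqrt(lr) * noise_std * dB
        path.append(x.copy())
    return np.array(path)


def quadratic_grad(x):
    """Gradient of the hypothetical test objective f(x) = 0.5 * ||x||^2."""
    return x


if __name__ == "__main__":
    noisy_path = simulate_sgd_sde(quadratic_grad, [5.0], lr=0.1,
                                  noise_std=1.0, n_steps=200)
    print("final iterate:", noisy_path[-1])
```

With `noise_std=0` the recursion reduces to plain gradient descent, which gives a simple sanity check on the discretization.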

Keywords

stochastic optimization, stochastic gradient descent, gradient descent, optimization

Cite

@article{arxiv.1810.02565,
  title  = {Continuous-time Models for Stochastic Optimization Algorithms},
  author = {Antonio Orvieto and Aurelien Lucchi},
  journal= {arXiv preprint arXiv:1810.02565},
  year   = {2020}
}

Comments

33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada
