Related papers: Stochastic Recursive Gradient Algorithm for Noncon…
In this paper, we propose a StochAstic Recursive grAdient algoritHm (SARAH), as well as its practical variant SARAH+, as a novel approach to the finite-sum minimization problems. Different from the vanilla SGD and other modern stochastic…
The StochAstic Recursive grAdient algoritHm (SARAH) algorithm is a variance reduced variant of the Stochastic Gradient Descent (SGD) algorithm that needs a gradient of the objective function from time to time. In this paper, we remove the…
We develop and analyze a variant of the SARAH algorithm, which does not require computation of the exact gradient. Thus this new method can be applied to general expectation minimization problems rather than only finite sum problems. While…
In this note we propose a new variant of the hybrid variance-reduced proximal gradient method in [7] to solve a common stochastic composite nonconvex optimization problem under standard assumptions. We simply replace the independent…
StochAstic Recursive grAdient algoritHm (SARAH), originally proposed for convex optimization and also proven to be effective for general nonconvex optimization, has received great attention due to its simple recursive framework for updating…
We introduce two new stochastic conjugate frameworks for a class of nonconvex and possibly also nonsmooth optimization problems. These frameworks are built upon Stochastic Recursive Gradient Algorithm (SARAH) and we thus refer to them as…
We present AI-SARAH, a practical variant of SARAH. As a variant of SARAH, this algorithm employs the stochastic recursive gradient yet adjusts step-size based on local geometry. AI-SARAH implicitly computes step-size and efficiently…
We propose a new stochastic first-order algorithmic framework to solve stochastic composite nonconvex optimization problems that covers both finite-sum and expectation settings. Our algorithms rely on the SARAH estimator introduced in…
We present a uniform analysis of biased stochastic gradient methods for minimizing convex, strongly convex, and non-convex composite objectives, and identify settings where bias is useful in stochastic gradient estimation. The framework we…
This paper considers decentralized minimization of $N:=nm$ smooth non-convex cost functions equally divided over a directed network of $n$ nodes. Specifically, we describe a stochastic first-order gradient method, called GT-SARAH, that…
We provide the first importance sampling variants of variance reduced algorithms for empirical risk minimization with non-convex loss functions. In particular, we analyze non-convex versions of SVRG, SAGA and SARAH. Our methods have the…
We propose ZeroSARAH -- a novel variant of the variance-reduced method SARAH (Nguyen et al., 2017) -- for minimizing the average of a large number of nonconvex functions $\frac{1}{n}\sum_{i=1}^{n}f_i(x)$. To the best of our knowledge, in…
This paper focuses on stochastic proximal gradient methods for optimizing a smooth non-convex loss function with a non-smooth non-convex regularizer and convex constraints. To the best of our knowledge we present the first non-asymptotic…
We study stochastic optimization of nonconvex loss functions, which are typical objectives for training neural networks. We propose stochastic approximation algorithms which optimize a series of regularized, nonlinearized losses on large…
We study nonconvex finite-sum problems and analyze stochastic variance reduced gradient (SVRG) methods for them. SVRG and related methods have recently surged into prominence for convex optimization given their edge over stochastic gradient…
Our work focuses on stochastic gradient methods for optimizing a smooth non-convex loss function with a non-smooth non-convex regularizer. Research on this class of problem is quite limited, and until recently no non-asymptotic convergence…
In this paper, we introduce a new stochastic approximation (SA) type algorithm, namely the randomized stochastic gradient (RSG) method, for solving an important class of nonlinear (possibly nonconvex) stochastic programming (SP) problems.…
Stochastic Gradient (SG) is the defacto iterative technique to solve stochastic optimization (SO) problems with a smooth (non-convex) objective $f$ and a stochastic first-order oracle. SG's attractiveness is due in part to its simplicity of…
Here we study non-convex composite optimization: first, a finite-sum of smooth but non-convex functions, and second, a general function that admits a simple proximal mapping. Most research on stochastic methods for composite optimization…
Adaptivity is an important yet under-studied property in modern optimization theory. The gap between the state-of-the-art theory and the current practice is striking in that algorithms with desirable theoretical guarantees typically involve…