Related papers: SAGA with Arbitrary Sampling
We provide the first importance sampling variants of variance reduced algorithms for empirical risk minimization with non-convex loss functions. In particular, we analyze non-convex versions of SVRG, SAGA and SARAH. Our methods have the…
We describe a novel optimization method for finite sums (such as empirical risk minimization problems) building on the recently introduced SAGA method. Our method achieves an accelerated convergence rate on strongly convex smooth problems.…
SAGA is a fast incremental gradient method on the finite sum problem and its effectiveness has been tested on a vast of applications. In this paper, we analyze SAGA on a class of non-strongly convex and non-convex statistical problem such…
We propose the stochastic average gradient (SAG) method for optimizing the sum of a finite number of smooth convex functions. Like stochastic gradient (SG) methods, the SAG method's iteration cost is independent of the number of terms in…
Stochastic variance-reduced algorithms such as Stochastic Average Gradient (SAG) and SAGA, and their deterministic counterparts like the Incremental Aggregated Gradient (IAG) method, have been extensively studied in large-scale machine…
Stochastic optimization naturally appear in many application areas, including machine learning. Our goal is to go further in the analysis of the Stochastic Average Gradient Accelerated (SAGA) algorithm. To achieve this, we introduce a new…
We study the problem of minimizing the average of a large number of smooth convex functions penalized with a strongly convex regularizer. We propose and analyze a novel primal-dual method (Quartz) which at every iteration samples and…
Despite the recent growth of theoretical studies and empirical successes of neural networks, gradient backpropagation is still the most widely used algorithm for training such networks. On the one hand, we have deterministic or full…
We consider the problem of minimizing the sum of three convex functions: i) a smooth function $f$ in the form of an expectation or a finite average, ii) a non-smooth function $g$ in the form of a finite average of proximable functions…
Variance-reduced stochastic gradient methods have gained popularity in recent times. Several variants exist with different strategies for the storing and sampling of gradients and this work concerns the interactions between these two…
We develop and analyze a new family of {\em nonaccelerated and accelerated loopless variance-reduced methods} for finite sum optimization problems. Our convergence analysis relies on a novel expected smoothness condition which upper bounds…
We study sampling problems associated with potentials that lack smoothness. The potentials can be either convex or non-convex. Departing from the standard smooth setting, the potentials are only assumed to be weakly smooth or non-smooth, or…
In this work we introduce a new optimisation method called SAGA in the spirit of SAG, SDCA, MISO and SVRG, a set of recently proposed incremental gradient algorithms with fast linear convergence rates. SAGA improves on the theory behind SAG…
We propose a novel stochastic smoothing accelerated gradient (SSAG) method for general constrained nonsmooth convex composite optimization, and analyze the convergence rates. The SSAG method allows various smoothing techniques, and can deal…
Finite-sum optimization plays an important role in the area of machine learning, and hence has triggered a surge of interest in recent years. To address this optimization problem, various randomized incremental gradient methods have been…
We present adaptive sequential SAA (sample average approximation) algorithms to solve large-scale two-stage stochastic linear programs. The iterative algorithm framework we propose is organized into \emph{outer} and \emph{inner} iterations…
Over the past ten years, driven by large scale optimisation problems arising from machine learning, the development of stochastic optimisation methods have witnessed a tremendous growth. However, despite their popularity, the theoretical…
We consider saddle point problems which objective functions are the average of $n$ strongly convex-concave individual components. Recently, researchers exploit variance reduction methods to solve such problems and achieve linear-convergence…
Motivated by applications to distributed optimization over networks and large-scale data processing in machine learning, we analyze the deterministic incremental aggregated gradient method for minimizing a finite sum of smooth functions…
We develop and analyze a new algorithm for empirical risk minimization, which is the key paradigm for training supervised machine learning models. Our method---SAGD---is based on a probabilistic interpolation of SAGA and gradient descent…