Related papers: Variance Regularization for Accelerating Stochasti…

Doubly Accelerated Stochastic Variance Reduced Dual Averaging Method for Regularized Empirical Risk Minimization

In this paper, we develop a new accelerated stochastic gradient method for efficiently solving the convex regularized empirical risk minimization problem in mini-batch settings. The use of mini-batches is becoming a golden standard in the…

Optimization and Control · Mathematics 2017-09-20 Tomoya Murata , Taiji Suzuki

Stochastic batch size for adaptive regularization in deep network optimization

We propose a first-order stochastic optimization algorithm incorporating adaptive regularization applicable to machine learning problems in deep learning framework. The adaptive regularization is imposed by stochastic process in determining…

Machine Learning · Computer Science 2020-04-15 Kensuke Nakamura , Stefano Soatto , Byung-Woo Hong

Stop Wasting My Gradients: Practical SVRG

We present and analyze several strategies for improving the performance of stochastic variance-reduced gradient (SVRG) methods. We first show that the convergence rate of these methods can be preserved under a decreasing sequence of errors…

Machine Learning · Computer Science 2016-08-06 Reza Babanezhad , Mohamed Osama Ahmed , Alim Virani , Mark Schmidt , Jakub Konečný , Scott Sallinen

Optimal Rates for Multi-pass Stochastic Gradient Methods

We analyze the learning properties of the stochastic gradient method when multiple passes over the data and mini-batches are allowed. We study how regularization properties are controlled by the step-size, the number of passes and the…

Machine Learning · Computer Science 2019-03-18 Junhong Lin , Lorenzo Rosasco

Implicit Gradient Alignment in Distributed and Federated Learning

A major obstacle to achieving global convergence in distributed and federated learning is the misalignment of gradients across clients, or mini-batches due to heterogeneity and stochasticity of the distributed data. In this work, we show…

Machine Learning · Computer Science 2021-12-14 Yatin Dandi , Luis Barba , Martin Jaggi

A Guide to Stochastic Optimisation for Large-Scale Inverse Problems

Stochastic optimisation algorithms are the de facto standard for machine learning with large amounts of data. Handling only a subset of available data in each optimisation step dramatically reduces the per-iteration computational costs,…

Numerical Analysis · Mathematics 2024-12-19 Matthias J. Ehrhardt , Zeljko Kereta , Jingwei Liang , Junqi Tang

Stochastic dual averaging methods using variance reduction techniques for regularized empirical risk minimization problems

We consider a composite convex minimization problem associated with regularized empirical risk minimization, which often arises in machine learning. We propose two new stochastic gradient methods that are based on stochastic dual averaging…

Optimization and Control · Mathematics 2016-03-09 Tomoya Murata , Taiji Suzuki

Stochastic Majorization-Minimization Algorithms for Large-Scale Optimization

Majorization-minimization algorithms consist of iteratively minimizing a majorizing surrogate of an objective function. Because of its simplicity and its wide applicability, this principle has been very popular in statistics and in signal…

Machine Learning · Statistics 2013-09-11 Julien Mairal

Randomized Smoothing for Stochastic Optimization

We analyze convergence rates of stochastic optimization procedures for non-smooth convex optimization problems. By combining randomized smoothing techniques with accelerated gradient methods, we obtain convergence rates of stochastic…

Optimization and Control · Mathematics 2012-04-10 John C. Duchi , Peter L. Bartlett , Martin J. Wainwright

Obtaining Adjustable Regularization for Free via Iterate Averaging

Regularization for optimization is a crucial technique to avoid overfitting in machine learning. In order to obtain the best performance, we usually train a model by tuning the regularization parameters. It becomes costly, however, when a…

Machine Learning · Computer Science 2020-08-18 Jingfeng Wu , Vladimir Braverman , Lin F. Yang

An Asynchronous Mini-Batch Algorithm for Regularized Stochastic Optimization

Mini-batch optimization has proven to be a powerful paradigm for large-scale learning. However, the state of the art parallel mini-batch algorithms assume synchronous operation or cyclic update orders. When worker nodes are heterogeneous…

Optimization and Control · Mathematics 2015-05-20 Hamid Reza Feyzmahdavian , Arda Aytekin , Mikael Johansson

Stochastic Nonconvex Optimization with Large Minibatches

We study stochastic optimization of nonconvex loss functions, which are typical objectives for training neural networks. We propose stochastic approximation algorithms which optimize a series of regularized, nonlinearized losses on large…

Machine Learning · Computer Science 2019-03-12 Weiran Wang , Nathan Srebro

Accelerating Stochastic Gradient Descent Using Antithetic Sampling

(Mini-batch) Stochastic Gradient Descent is a popular optimization method which has been applied to many machine learning applications. But a rather high variance introduced by the stochastic gradient in each step may slow down the…

Machine Learning · Computer Science 2018-10-09 Jingchang Liu , Linli Xu

Meta-Regularization: An Approach to Adaptive Choice of the Learning Rate in Gradient Descent

We propose \textit{Meta-Regularization}, a novel approach for the adaptive choice of the learning rate in first-order gradient descent methods. Our approach modifies the objective function by adding a regularization term on the learning…

Machine Learning · Computer Science 2021-04-13 Guangzeng Xie , Hao Jin , Dachao Lin , Zhihua Zhang

Stability and Generalization of Stochastic Optimization with Nonconvex and Nonsmooth Problems

Stochastic optimization has found wide applications in minimizing objective functions in machine learning, which motivates a lot of theoretical studies to understand its practical success. Most of existing studies focus on the convergence…

Artificial Intelligence · Computer Science 2023-07-19 Yunwen Lei

Variance Reduction for Faster Non-Convex Optimization

We consider the fundamental problem in non-convex optimization of efficiently reaching a stationary point. In contrast to the convex case, in the long history of this basic problem, the only known theoretical results on first-order…

Optimization and Control · Mathematics 2016-08-26 Zeyuan Allen-Zhu , Elad Hazan

Stochastic Variance-Reduced Newton: Accelerating Finite-Sum Minimization with Large Batches

Stochastic variance reduction has proven effective at accelerating first-order algorithms for solving convex finite-sum optimization tasks such as empirical risk minimization. Incorporating second-order information has proven helpful in…

Optimization and Control · Mathematics 2025-04-30 Michał Dereziński

On the Regularizing Property of Stochastic Gradient Descent

Stochastic gradient descent is one of the most successful approaches for solving large-scale problems, especially in machine learning and statistics. At each iteration, it employs an unbiased estimator of the full gradient computed from one…

Numerical Analysis · Mathematics 2018-12-05 Bangti Jin , Xiliang Lu

On Stability and Generalization of Bilevel Optimization Problem

(Stochastic) bilevel optimization is a frequently encountered problem in machine learning with a wide range of applications such as meta-learning, hyper-parameter optimization, and reinforcement learning. Most of the existing studies on…

Machine Learning · Computer Science 2023-03-16 Meng Ding , Mingxi Lei , Yunwen Lei , Di Wang , Jinhui Xu

Exploiting Uncertainty of Loss Landscape for Stochastic Optimization

We introduce novel variants of momentum by incorporating the variance of the stochastic loss function. The variance characterizes the confidence or uncertainty of the local features of the averaged loss surface across the i.i.d. subsets of…

Machine Learning · Computer Science 2019-05-31 Vineeth S. Bhaskara , Sneha Desai