Related papers: Accelerating Stochastic Gradient Descent Using Ant…

Accelerating Minibatch Stochastic Gradient Descent using Stratified Sampling

Stochastic Gradient Descent (SGD) is a popular optimization method which has been applied to many important machine learning tasks such as Support Vector Machines and Deep Neural Networks. In order to parallelize SGD, minibatch training is…

Machine Learning · Statistics 2014-05-14 Peilin Zhao, Tong Zhang

Accelerating Minibatch Stochastic Gradient Descent using Typicality Sampling

Machine learning, especially deep neural networks, has been rapidly developed in fields including computer vision, speech recognition and reinforcement learning. Although Mini-batch SGD is one of the most popular stochastic optimization…

Machine Learning · Computer Science 2019-03-12 Xinyu Peng, Li Li, Fei-Yue Wang

Differentiable Antithetic Sampling for Variance Reduction in Stochastic Variational Inference

Stochastic optimization techniques are standard in variational inference algorithms. These methods estimate gradients by approximating expectations with independent Monte Carlo samples. In this paper, we explore a technique that uses…

Machine Learning · Computer Science 2019-08-15 Mike Wu, Noah Goodman, Stefano Ermon

Randomised Splitting Methods and Stochastic Gradient Descent

We explore an explicit link between stochastic gradient descent using common batching strategies and splitting methods for ordinary differential equations. From this perspective, we introduce a new minibatching strategy (called Symmetric…

Optimization and Control · Mathematics 2025-04-08 Luke Shaw, Peter A. Whalley

mS2GD: Mini-Batch Semi-Stochastic Gradient Descent in the Proximal Setting

We propose a mini-batching scheme for improving the theoretical complexity and practical performance of semi-stochastic gradient descent applied to the problem of minimizing a strongly convex composite function represented as the sum of an…

Machine Learning · Computer Science 2014-10-20 Jakub Konečný, Jie Liu, Peter Richtárik, Martin Takáč

Doubly Accelerated Stochastic Variance Reduced Dual Averaging Method for Regularized Empirical Risk Minimization

In this paper, we develop a new accelerated stochastic gradient method for efficiently solving the convex regularized empirical risk minimization problem in mini-batch settings. The use of mini-batches is becoming a golden standard in the…

Optimization and Control · Mathematics 2017-09-20 Tomoya Murata, Taiji Suzuki

Ordered SGD: A New Stochastic Optimization Framework for Empirical Risk Minimization

We propose a new stochastic optimization framework for empirical risk minimization problems such as those that arise in machine learning. The traditional approaches, such as (mini-batch) stochastic gradient descent (SGD), utilize an…

Machine Learning · Statistics 2020-02-04 Kenji Kawaguchi, Haihao Lu

The Practicality of Stochastic Optimization in Imaging Inverse Problems

In this work we investigate the practicality of stochastic gradient descent and recently introduced variants with variance-reduction techniques in imaging inverse problems. Such algorithms have been shown in the machine learning literature…

Optimization and Control · Mathematics 2021-01-26 Junqi Tang, Karen Egiazarian, Mohammad Golbabaee, Mike Davies

Stochastic versus Deterministic in Stochastic Gradient Descent

This paper theoretically reanalyzes the convergence of the mini-batch stochastic gradient descent (SGD) for a structured minimization problem involving a finite-sum function with its gradient being stochastically approximated, and an…

Optimization and Control · Mathematics 2026-04-07 Runze Li, Jintao Xu, Wenxun Xing

A Stochastic Gradient Method with Biased Estimation for Faster Nonconvex Optimization

A number of optimization approaches have been proposed for optimizing nonconvex objectives (e.g. deep learning models), such as batch gradient descent, stochastic gradient descent and stochastic variance reduced gradient descent. Theory…

Machine Learning · Computer Science 2019-05-15 Jia Bi, Steve R. Gunn

Optimal Mini-Batch Size Selection for Fast Gradient Descent

This paper presents a methodology for selecting the mini-batch size that minimizes Stochastic Gradient Descent (SGD) learning time for single and multiple learner problems. By decoupling algorithmic analysis issues from hardware and…

Machine Learning · Computer Science 2019-11-18 Michael P. Perrone, Haidar Khan, Changhoan Kim, Anastasios Kyrillidis, Jerry Quinn, Valentina Salapura

Mini-batch stochastic gradient descent with dynamic sample sizes

We focus on solving constrained convex optimization problems using mini-batch stochastic gradient descent. Dynamic sample size rules are presented which ensure a descent direction with high probability. Empirical results from two…

Optimization and Control · Mathematics 2017-08-03 Michael R. Metel

MBGDT:Robust Mini-Batch Gradient Descent

In high dimensions, most machine learning methods are fragile even when there are only a few outliers. To address this, we hope to introduce a new method with a base learner, such as Bayesian regression or stochastic gradient descent, to solve…

Machine Learning · Computer Science 2022-06-16 Hanming Wang, Haozheng Luo, Yue Wang

Mini-Batch Semi-Stochastic Gradient Descent in the Proximal Setting

We propose mS2GD: a method incorporating a mini-batching scheme for improving the theoretical complexity and practical performance of semi-stochastic gradient descent (S2GD). We consider the problem of minimizing a strongly convex function…

Machine Learning · Computer Science 2016-04-20 Jakub Konečný, Jie Liu, Peter Richtárik, Martin Takáč

Accelerating Asynchronous Stochastic Gradient Descent for Neural Machine Translation

In order to extract the best possible performance from asynchronous stochastic gradient descent one must increase the mini-batch size and scale the learning rate accordingly. In order to achieve further speedup we introduce a technique that…

Computation and Language · Computer Science 2018-09-17 Nikolay Bogoychev, Marcin Junczys-Dowmunt, Kenneth Heafield, Alham Fikri Aji

Stop Wasting My Gradients: Practical SVRG

We present and analyze several strategies for improving the performance of stochastic variance-reduced gradient (SVRG) methods. We first show that the convergence rate of these methods can be preserved under a decreasing sequence of errors…

Machine Learning · Computer Science 2016-08-06 Reza Babanezhad, Mohamed Osama Ahmed, Alim Virani, Mark Schmidt, Jakub Konečný, Scott Sallinen

Better Mini-Batch Algorithms via Accelerated Gradient Methods

Mini-batch algorithms have been proposed as a way to speed-up stochastic convex optimization problems. We study how such algorithms can be improved using accelerated gradient methods. We provide a novel analysis, which shows how standard…

Machine Learning · Computer Science 2011-06-24 Andrew Cotter, Ohad Shamir, Nathan Srebro, Karthik Sridharan

Adaptive Sampling Strategies for Stochastic Optimization

In this paper, we propose a stochastic optimization method that adaptively controls the sample size used in the computation of gradient approximations. Unlike other variance reduction techniques that either require additional storage or the…

Optimization and Control · Mathematics 2017-11-01 Raghu Bollapragada, Richard Byrd, Jorge Nocedal

Bolstering Stochastic Gradient Descent with Model Building

Stochastic gradient descent method and its variants constitute the core optimization algorithms that achieve good convergence rates for solving machine learning problems. These rates are obtained especially when these algorithms are…

Machine Learning · Computer Science 2024-03-14 S. Ilker Birbil, Ozgur Martin, Gonenc Onay, Figen Oztoprak

Stochastic Gradient Descent Meets Distribution Regression

Stochastic gradient descent (SGD) provides a simple and efficient way to solve a broad range of machine learning problems. Here, we focus on distribution regression (DR), involving two stages of sampling: Firstly, we regress from…

Machine Learning · Statistics 2021-03-08 Nicole Mücke