Related papers: Active Mini-Batch Sampling using Repulsive Point P…

Determinantal Point Processes for Mini-Batch Diversification

We study a mini-batch diversification scheme for stochastic gradient descent (SGD). While classical SGD relies on uniformly sampling data points to form a mini-batch, we propose a non-uniform sampling scheme based on the Determinantal Point…

Machine Learning · Computer Science 2017-09-12 Cheng Zhang, Hedvig Kjellstrom, Stephan Mandt

Determinantal point processes based on orthogonal polynomials for sampling minibatches in SGD

Stochastic gradient descent (SGD) is a cornerstone of machine learning. When the number N of data items is large, SGD relies on constructing an unbiased estimator of the gradient of the empirical risk using a small subset of the original…

Machine Learning · Statistics 2021-12-14 Remi Bardenet, Subhro Ghosh, Meixia Lin

Accelerating Minibatch Stochastic Gradient Descent using Stratified Sampling

Stochastic Gradient Descent (SGD) is a popular optimization method which has been applied to many important machine learning tasks such as Support Vector Machines and Deep Neural Networks. In order to parallelize SGD, minibatch training is…

Machine Learning · Statistics 2014-05-14 Peilin Zhao, Tong Zhang

Efficient Sampling for k-Determinantal Point Processes

Determinantal Point Processes (DPPs) are elegant probabilistic models of repulsion and diversity over discrete sets of items. But their applicability to large sets is hindered by expensive cubic-complexity matrix operations for basic tasks…

Machine Learning · Computer Science 2016-05-31 Chengtao Li, Stefanie Jegelka, Suvrit Sra

mS2GD: Mini-Batch Semi-Stochastic Gradient Descent in the Proximal Setting

We propose a mini-batching scheme for improving the theoretical complexity and practical performance of semi-stochastic gradient descent applied to the problem of minimizing a strongly convex composite function represented as the sum of an…

Machine Learning · Computer Science 2014-10-20 Jakub Konečný, Jie Liu, Peter Richtárik, Martin Takáč

Adaptive Sketches for Robust Regression with Importance Sampling

We introduce data structures for solving robust regression through stochastic gradient descent (SGD) by sampling gradients with probability proportional to their norm, i.e., importance sampling. Although SGD is widely used for large scale…

Machine Learning · Computer Science 2022-07-19 Sepideh Mahabadi, David P. Woodruff, Samson Zhou

Accelerating Minibatch Stochastic Gradient Descent using Typicality Sampling

Machine learning, especially deep neural networks, has been rapidly developed in fields including computer vision, speech recognition and reinforcement learning. Although Mini-batch SGD is one of the most popular stochastic optimization…

Machine Learning · Computer Science 2019-03-12 Xinyu Peng, Li Li, Fei-Yue Wang

Accelerating Stochastic Gradient Descent Using Antithetic Sampling

(Mini-batch) Stochastic Gradient Descent is a popular optimization method which has been applied to many machine learning applications. But a rather high variance introduced by the stochastic gradient in each step may slow down the…

Machine Learning · Computer Science 2018-10-09 Jingchang Liu, Linli Xu

Approximate Inference in Continuous Determinantal Point Processes

Determinantal point processes (DPPs) are random point processes well-suited for modeling repulsion. In machine learning, the focus of DPP-based models has been on diverse subset selection from a discrete and finite base set. This discrete…

Machine Learning · Statistics 2013-11-14 Raja Hafiz Affandi, Emily B. Fox, Ben Taskar

Mini-Batch Semi-Stochastic Gradient Descent in the Proximal Setting

We propose mS2GD: a method incorporating a mini-batching scheme for improving the theoretical complexity and practical performance of semi-stochastic gradient descent (S2GD). We consider the problem of minimizing a strongly convex function…

Machine Learning · Computer Science 2016-04-20 Jakub Konečný, Jie Liu, Peter Richtárik, Martin Takáč

Batched Stochastic Gradient Descent with Weighted Sampling

We analyze a batched variant of Stochastic Gradient Descent (SGD) with weighted sampling distribution for smooth and non-smooth objective functions. We show that by distributing the batches computationally, a significant speedup in the…

Numerical Analysis · Mathematics 2017-03-02 Deanna Needell, Rachel Ward

Safe Adaptive Importance Sampling

Importance sampling has become an indispensable strategy to speed up optimization algorithms for large-scale applications. Improved adaptive variants - using importance values defined by the complete gradient information which changes…

Machine Learning · Computer Science 2017-11-08 Sebastian U. Stich, Anant Raj, Martin Jaggi

Stochastic versus Deterministic in Stochastic Gradient Descent

This paper theoretically reanalyzes the convergence of the mini-batch stochastic gradient descent (SGD) for a structured minimization problem involving a finite-sum function with its gradient being stochastically approximated, and an…

Optimization and Control · Mathematics 2026-04-07 Runze Li, Jintao Xu, Wenxun Xing

On the diffusion approximation of nonconvex stochastic gradient descent

We study the Stochastic Gradient Descent (SGD) method in nonconvex optimization problems from the point of view of approximating diffusion processes. We prove rigorously that the diffusion process can approximate the SGD algorithm weakly…

Machine Learning · Statistics 2018-03-06 Wenqing Hu, Chris Junchi Li, Lei Li, Jian-Guo Liu

Global Convergence of Stochastic Gradient Descent for Some Non-convex Matrix Problems

Stochastic gradient descent (SGD) on a low-rank factorization is commonly employed to speed up matrix problems including matrix completion, subspace tracking, and SDP relaxation. In this paper, we exhibit a step size scheme for SGD on a…

Machine Learning · Computer Science 2015-02-11 Christopher De Sa, Kunle Olukotun, Christopher Ré

Data Sampling Affects the Complexity of Online SGD over Dependent Data

Conventional machine learning applications typically assume that data samples are independently and identically distributed (i.i.d.). However, practical scenarios often involve a data-generating process that produces highly dependent data…

Machine Learning · Computer Science 2022-04-04 Shaocong Ma, Ziyi Chen, Yi Zhou, Kaiyi Ji, Yingbin Liang

Stochastic Gradient Descent Meets Distribution Regression

Stochastic gradient descent (SGD) provides a simple and efficient way to solve a broad range of machine learning problems. Here, we focus on distribution regression (DR), involving two stages of sampling: Firstly, we regress from…

Machine Learning · Statistics 2021-03-08 Nicole Mücke

GDPP: Learning Diverse Generations Using Determinantal Point Process

Generative models have proven to be an outstanding tool for representing high-dimensional probability distributions and generating realistic-looking images. An essential characteristic of generative models is their ability to produce…

Machine Learning · Computer Science 2019-11-26 Mohamed Elfeki, Camille Couprie, Morgane Riviere, Mohamed Elhoseiny

Gradient Diversity: a Key Ingredient for Scalable Distributed Learning

It has been experimentally observed that distributed implementations of mini-batch stochastic gradient descent (SGD) algorithms exhibit speedup saturation and decaying generalization ability beyond a particular batch-size. In this work, we…

Machine Learning · Computer Science 2018-01-09 Dong Yin, Ashwin Pananjady, Max Lam, Dimitris Papailiopoulos, Kannan Ramchandran, Peter Bartlett

RD-DPP: Rate-Distortion Theory Meets Determinantal Point Process to Diversify Learning Data Samples

In some practical learning tasks, such as traffic video analysis, the number of available training samples is restricted by different factors, such as limited communication bandwidth and computation power. Determinantal Point Process (DPP)…

Machine Learning · Computer Science 2023-08-17 Xiwen Chen, Huayu Li, Rahul Amin, Abolfazl Razi